Language key building block in quest for mechanical mind

Somewhere in a distant future lies the ultimate goal: machines that can learn and act in ways that demonstrate cognitive functioning. Realizing that goal may be many years away, but that hasn't stopped Beckman Institute for Advanced Science and Technology researcher Stephen Levinson from dedicating more than two decades to a concept that went from science fiction fantasy to a scientific possibility with the dawn of the computer age. In fact, it's been his driving research interest since graduate school and during a career at Bell Labs.

Levinson is head of the Language Acquisition and Robotics group at Beckman and Professor of Electrical and Computer Engineering at the University of Illinois. Since coming to Beckman from Bell Labs in 1997, his work has centered on building a robot that simulates human cognitive function, with a focus on what Levinson says is the key to understanding cognition: language.

"In the adult human everything we do and everything we know is mediated linguistically," Levinson said. "My argument is that the core of our adult intelligence is language, and therefore it is an important thing to explore. Certainly in the adult human it's very difficult to tease apart language and other cognitive function."

Despite the challenges, the goal of Levinson and his students is to construct a mechanical "mind" by developing modules for sensory input processing, speech recognition and generation, navigation, and associative learning. They are concentrating on teaching language to their two most recent robotic creations, named Illy and Norbert. It is no small task.

"It's a model for the human mind. It's nothing less than that," Levinson said of the project. "And it says that you can't construct it symbolically from first principles. It has to be learned and acquired by interaction with the real world. That's what children do and we're trying to emulate that."

Levinson said the project faces many difficulties with no guarantees of ultimate success. "We're at the very earliest stages. I could make an argument that this is the most difficult scientific challenge we scientists face."

Levinson discusses the robots and other related topics in his book, "Mathematical Models for Speech Technology," released earlier this year.

"What I say in my book is that this is a very humble beginning to a very long-term research project," Levinson said. "It is arguably one of the most difficult, if not the very most difficult, challenges that science faces anywhere."

Levinson said science is beginning to understand biological function and "this is beyond that. What we're asking about is psychological function and very little is known and very little is understood. Perhaps the way to get at it is this: in the lab, we are very, very happy if the robot does anything at all interesting and unexpected. That's a success and that's exciting."

The successes include the hardware (the chassis were built by Arrick Robotics and heavily modified), the software (the programs were all written by Levinson and his students), and a few, small first steps taken by the robots toward learning.

Levinson said one of the early successes was teaching the robot both English and Chinese, because there were English and Chinese students involved in the project.

"Another very exciting outcome was the visual navigation," he said. "The fact that the robot could see an object and could figure out where it was, and could navigate over there and pick it up. It learned on its own. You could program a robot to do it; even to program it would not be trivial. But this is learned behavior. So that was very exciting. One of the nice things was when it went to grab the object, it would back up and try again. It's all learned behavior."

That concept of learned behavior is what sets this project apart from previous machine learning experiments or robotic behavior based on artificial, or programmed, intelligence.

"Artificial intelligence (AI) says that a computer is a sufficiently powerful device, so that it can emulate cognitive process and it can be programmed directly by people," Levinson said. "Now that contrasts with what we're saying. We're saying that there may be some things that are built in (to the brain), but by and large, the human brain learns (during) the life of the individual how to do whatever it is that we people do."

Levinson said that approach ties in with a general theme of Beckman research into cognition that says the brain is malleable, adaptable and changes through real-life experience. He believes the project serves as a nice confluence between Beckman's artificial intelligence and biological intelligence research areas.

"One of the major reasons why I came here is because it was such an absolutely perfect fit for Beckman," Levinson said. "People here were very enthusiastic about it. When I came here in 1997 I began working on it and this was sort of the fulfillment of an idea that I had hatched earlier but was unable to pursue."

Levinson developed the first prototype, Alan, while working at Bell Labs. But that work was done on a shoestring budget and on a part-time basis. Upon arriving at Beckman, he was able to pursue the work full-time.

"The goal was to build a machine that could use language exactly the way humans do," Levinson said. "The central technical idea was that the only way to do that was to build a robot."

Levinson said teaching the robots the semantics, or meaning, of language is crucial to building a mechanical mind.

"The meaning drives everything. The only reason that language is interesting is because we can express meaningful ideas," he said. "We even have a definition of what that is, a behavioral definition. If the robot can roam around its environment and if it can understand spoken commands and spoken instructions or other communication, and if it can take the appropriate action, then it in some sense understands language. And that is meaning."

Currently, the robots' comprehension of language is rudimentary.

"Right now the best comparison is it understands language the way a faithful dog does," Levinson said. "It recognizes a few words, a dozen words or so. It can reliably understand those words." The next step involves making associations between words. "That's what we're working on now. For the very first time, we're going to try and learn some language."

A paper by Levinson and Ph.D. student Kevin Squire, "HMM-Based Semantic Learning for a Mobile Robot," written for the Fall 2005 issue of IEEE's Transactions on Evolutionary Computation, describes their reasons and methods for teaching language to robots. They write that different scientists have studied cognitive development in various environments, but "our study occurs in a robotics lab, where we attempt to embody cognitive models in steel and silicon."

Levinson said the results detailed in the paper showed the robot could learn language. "That was really exciting because it learned to do things under voice command. So you actually tell it to do something and it would do it. That was all learned behavior."

Levinson said Alan is barely used these days while Illy is "the workhorse" and Norbert is a more compact version of Illy. The robots can manipulate objects and navigate using sensors. Each has grippers and shoulders that can raise and lower, while sensors provide tactile sense both on the hand and around the perimeter of the robot. There are cameras for eyes that are mounted on a head that both pans and tilts, with input coming from binocular color vision and binaural hearing. Drive wheels allow the robot to roam around the lab.

The software is a real-time distributed operating system powering three onboard computers connected by wireless ethernet to a small network of computers in the laboratory. Levinson said his group is the only one writing this kind of software.

"We are simultaneously doing vision, hearing, motion planning, and cognitive function, navigation, and object manipulation, all at once," he said. "And that has to be done in real time, online. This is one of the critical differences between this and symbolic AI. This is not simulated. This is real steel and rubber, so all of this stuff takes a huge amount of computation that has to be done all together."

Levinson said another important aspect of the work is its integration: many researchers and research groups concentrate on individual components, focusing on just vision, or speech and language, or navigation, or learning.

"We're putting the whole ball of wax all together and it all has to operate together," he said. "It's a significant technical challenge as well as a scientific one. We have done a very good job, if I do say so myself, against the technical challenges. We have a pretty good hardware platform, and quite a nice software system."

Levinson said there is a small group of researchers, mostly in the United States and Japan, working on the idea of an embodied intelligent system, but his group's focus on semantics sets it apart.

"Amongst all those people, this project is the only one that has dared to deal with language," he said. "Others are trying to learn motor control or vision or navigation, but no one has touched language."

Even though the fruits of the project's labor are long-term, there have been some technical spinoffs from the research. Levinson said a two-microphone array and an algorithm that student Dan Feng Li wrote could be useful for some multimedia communication processes such as teleconferencing or for sound location or pickup in noisy environments. Levinson said that was an unexpected application byproduct coming out of the research, as was the real-time distributed operating system.

"It is a real-time distributed memory, distributed processor operating system," he said. "It's a piece of software unto itself. It could have industrial uses, process control, manufacturing control."

But the goal of the research at this point is not technology applications.

"Its focus is basic science," Levinson said. "That's increasingly rare in these days. And I think that has to be one of the themes of Beckman. We are the Institute for Advanced Science and Technology. We're not just technology."