Courtesy of New Scientist Magazine
By Duncan Graham-Rowe
THE YOUNGSTER EYES ME suspiciously as I enter the room, its gaze following as I cross the
floor. Then after a while, it loses interest and turns back to its toy dinosaur. But then
I never was any good with kids.
When Rodney Brooks set out to build a humanoid robot with the intelligence of a
two-year-old child, he didn't realize what he was letting himself in for. Six years later,
he and his team at the Massachusetts Institute of Technology have transformed themselves
from artificial intelligence (AI) experts into the most unlikely bunch of developmental
psychologists and nannies.
Colorful toys litter the labs, and much of their time is spent playing with and
entertaining their charges. This is because Cog and its alter ego Kismet are the first of
a new type of robot designed to behave in the same way as small children. If you want to
create a robot with the intelligence of a two-year-old, Brooks reasoned, the best approach
was to give it the innate abilities of a newborn and let it develop.
Motors and software may control Cog and Kismet, but it's difficult to believe that this is
all that makes them tick. Cog may be only a head, arms and torso, but as you watch it
explore the world you can't help feeling that something profound is going on. It moves
smoothly like a creature rather than with the abrupt, mechanical movements of a machine.
And its eyes dart from one object to the next, with its head slowly bringing up the rear,
as though it were human.
Kismet is even more convincing. It may be just a head, but it has more interesting facial
features than its larger cousin, making its moods easier to interpret. Complete with
eyelids, ears and newly acquired lips, it has the appearance of something young and cute,
and reacts to events with an impressive repertoire of doe-eyed expressions ranging from
surprise and interest to sadness and anxiety.
One of the unexpected findings from the project so far is that the appearance of these
human-like actions and reactions is not just an optional extra. The robots, like children,
will not develop unless their caregivers read more into their behavior than is actually
there.
The idea of building a robot with human intelligence, even that of a child, is not only
ambitious, it's highly unconventional. Most AI researchers confine themselves to
recreating a single sense, such as vision, or simple behaviors--not the whole shebang.
"The hard-core roboticists are not comfortable with our work because we are not doing
the same sorts of things," says Brooks. If it weren't for the fact that Cog does what it's supposed to, he says, his team would have been written off long ago.
But then Brooks has always been controversial. In his work on robot insects in the 1980s,
he rejected the idea of a central "brain" and showed how intelligent behaviors
could emerge from cooperation between a number of simple, independent systems. Each leg of
Genghis, a six-legged robot, for example, had its own simple controller, and walking
"emerged" by timing the actions of the controllers.
Brooks also pioneered the idea of increasing the complexity of robot behavior by building
up a hierarchy of simple systems. When "whiskers" fitted to Genghis detected an
obstacle, they modulated the signaling between the legs in such a way that the robot
walked over the obstacle as though it "knew" what was before it.
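To make the layered-control idea concrete, here is a minimal Python sketch of how walking and obstacle-climbing could emerge from independent per-leg controllers plus a modulating "whisker" layer. The class names, timing rule and numbers are illustrative assumptions, not Genghis's actual code.

```python
# Illustrative sketch of decentralized, layered control in the spirit of
# Brooks' approach. Nothing here is taken from the real Genghis software.

class LegController:
    """Each leg runs its own trivial cycle: swing forward, then push back."""
    def __init__(self, phase_offset):
        self.phase = phase_offset       # staggering the legs lets a gait emerge
        self.lift_height = 1.0          # higher layers can modulate this

    def step(self, t):
        # Alternate between swing and stance purely on local timing.
        if (t + self.phase) % 2 == 0:
            return ("swing", self.lift_height)
        return ("stance", 0.0)


class WhiskerLayer:
    """A higher layer that never commands the legs directly, only modulates them."""
    def __init__(self, legs):
        self.legs = legs

    def on_obstacle(self, detected):
        # When the whiskers report an obstacle, raise the swing height so the
        # robot steps over it as though it "knew" what was in front of it.
        for leg in self.legs:
            leg.lift_height = 2.5 if detected else 1.0


legs = [LegController(phase_offset=i % 2) for i in range(6)]
whiskers = WhiskerLayer(legs)

whiskers.on_obstacle(True)              # obstacle ahead: swings get higher
for t in range(4):
    print([leg.step(t) for leg in legs])
```

No single controller holds a plan of the whole gait; whatever "knowledge" the robot appears to have lives in the way the layers interact.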
His approach broke the traditional mold of AI, which tended to treat intelligence as a
problem that could be encoded in rules, and hence software. Give a robot a software
replica of its world to refer to, and a set of instructions for negotiating that world,
and it would appear to act intelligently. By contrast, Brooks argued that robots need no
internal description of the world: for him, intelligent behaviors emerge only when a robot
exists in the world and interacts with it.
With the Cog project, Brooks wants to take his ideas further. "We're trying to build
a Commander Data," he says with a smile. He knows it's a lofty aim: the android from Star Trek: The Next Generation is smarter by far than the average life form.
Four eyes
At present, Cog is still a collection of isolated computer-controlled systems. For eyes,
it has four cameras, two for peripheral vision and two for high-resolution, narrow
viewing, which all move in their sockets in the same way as human eyes. It also has a
movable head, neck, arms and gripper-hands. In addition, Cog has an auditory system that
gives it enough information to know where sounds come from, a basic sense of balance and
the rudiments of a sense of touch. Most of its motors come with position sensors to let
Cog know where the rest of its "body" is, and strain and temperature gauges so
it doesn't overload anything.
These are the basic systems--like the legs and whiskers of Genghis--which, when connected to each other, will allow Cog to display intelligent behaviors. Already, it can
find faces, tell if a person is looking at it, detect movement, copy head gestures such as
nodding, and even play with a Slinky. "We also do some very basic auditory stream
segregation, so you can get the sound of my voice away from the fans in the
background," says Brian Scassellati, a cognitive scientist and one of Cog's principal
architects.
What makes Cog so different from its insect predecessors is that Brooks and his team are
not simply trying to build an increasingly complex system and watch what emerges: they are
trying to produce specific human behaviors.
Consider looking somebody in the eye. Cog does this in a series of steps. First, it
notices that someone is present by detecting motion in its peripheral vision. Then it
checks whether the moving object has a face, using an algorithm that matches patterns of
light and shade to a template made from shadows cast by faces in different lighting
conditions and orientations. Once it finds a face, Cog's eyes, then its head, turn towards
the face and it matches the peripheral image to the high-resolution image. This done, a
second template picks out the eyes within the image of the face. The behavior appears
lifelike and the robot looks you in the eye in real time, without needing massive
computing power.
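In outline, the sequence might look something like the following Python sketch; the helper functions and camera methods are stand-ins for Cog's actual vision routines, which the article does not describe in code.

```python
# Hypothetical outline of the eye-contact pipeline described above. The names
# (detect_motion, matches_face_template, find_eyes, Camera methods) are
# illustrative stand-ins, not Cog's real interfaces.

def detect_motion(wide_frame):
    """Return the region of the peripheral image where something moved, or None."""
    ...

def matches_face_template(region):
    """Compare light/shade patterns in the region against stored face templates."""
    ...

def find_eyes(foveal_frame):
    """Apply a second template to locate the eyes within the high-resolution face image."""
    ...

def look_in_the_eye(camera):
    region = detect_motion(camera.peripheral())       # 1. notice that someone is there
    if region is None or not matches_face_template(region):
        return None                                   # 2. the moving thing isn't a face
    camera.saccade_to(region)                         # 3. eyes, then head, turn to the face
    return find_eyes(camera.foveal())                 # 4. pick out the eyes in the close-up
```

Because each stage is simple, the whole loop can run in real time, which is how Cog manages the behavior without massive computing power.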
In line with Brooks' original ideas, all Cog's actions and abilities are discrete and
built in hierarchies. The result is that new behaviors, such as detecting a waving hand or
recognizing individual faces, can be built on top without redesigning the existing
systems.
Still, for the most part, Cog's abilities are purely reactive. If it is ever to fulfill Brooks's dream of learning like a child, Cog is going to need a memory, and so far the team has not tackled that problem (see "No past, no future," below). The
researchers have, however, begun to look at one aspect of learning that they originally
took for granted, but which has proved to be crucial to Cog's development--social
interaction.
As infants we learn gradually, expanding our abilities as our carers slowly increase the
complexity of the tasks they set us. This incremental approach relies heavily on social
interaction. It had always been part of the plan to make Cog communicate with people, but
its designers hadn't counted on how complicated a process this is. The robot must be able
not only to understand the intentions of its carer but also to impart its own intentions.
This might be straightforward but for the fact that Cog doesn't have any intentions or
know how to express them--which makes it similar to very young babies.
Cynthia Breazeal, principal researcher on Kismet, has wrestled with this problem. The
solution, she believes, is that young babies are great at giving the impression that there
is more going on inside them than there really is. Developmental psychologists argue that
young babies are capable of showing only a few, innate expressions. Yet parents assume
from the beginning that their babies behave in the way they do for meaningful reasons.
Parents interpret facial and vocal expressions as indicators of how a baby is
feeling--so if a child is content, parents tend to maintain their level of interaction,
but if the child appears uninterested or upset they may intensify the interaction or
change it to regain the baby's attention. From the consistency of their parents' reactions, children learn how to manipulate their parents and so gain attention. This puts them in an ideal position to carry on learning.
These ideas are embodied in Cog's cousin. "Kismet takes advantage of the way we are
programmed to interact with small children," says Breazeal. It was designed to
emotionally blackmail people. If you don't do what it wants, it scowls or looks sad; if you please it, it rewards you with a smile or a look of interest.
Kismet is actually a platform for testing behavioral systems before installing them in
Cog. It has a vision system similar to Cog's, able to detect motion and faces. But Kismet
also has what Breazeal calls a motivational system, which tries to keep the robot in a
happy, interested state. This consists of a collection of drives, such as the urges to be
social and stimulated, and behaviors that satisfy those drives. The
intensities of the drives, which dictate the expression on Kismet's face, increase if they
are not satisfied and decrease when the appropriate behaviors are operating.
So, if Kismet is left alone, the intensity of its social drive increases. This makes it
look sad, communicating to anyone passing that it craves attention, and activating a
behavior called socialize. As soon as it detects a face, it fixes on it and begins to
socialize--sadness changes to happiness or interest and the intensity of its social drive
begins to fall. If, however, the interaction is too intense and the social drive is pushed
too far, the robot becomes overwhelmed and gives a look of displeasure.
Similar events take place with Kismet's "stimulation" drive, which is
counter-balanced by a behavior called play. At present, Kismet's favorite object is its
toy inchworm. If left alone, the robot looks sad and its play behavior is activated, but
bounce the worm and it starts to look happy and the desire to be stimulated begins to
fall. If you bounce the worm too fast, however, Kismet looks disgusted. And if you carry
on, then disgust changes to anger, or it may simply close its eyes and sleep to allow its
"brain" to catch up. In an added refinement, if you repeat the same movement
repeatedly then Kismet gets bored and looks sad again.
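Pieced together from the description above, Kismet's motivational loop might be sketched like this in Python; the drive names, thresholds and update rule are invented for illustration, not the team's implementation.

```python
# Illustrative sketch of a Kismet-style drive, based only on the behavior
# described in the article. Rates and thresholds are made-up example values.

class Drive:
    def __init__(self, name):
        self.name = name
        self.intensity = 0.0            # grows while unsatisfied, shrinks while satisfied

    def update(self, satisfied, rate=0.25):
        self.intensity += -rate if satisfied else rate
        self.intensity = max(0.0, min(1.0, self.intensity))

def expression(drive, overstimulated=False, bored=False):
    # Map the drive's state onto the face, as the article describes.
    if overstimulated:
        return "displeasure"            # interaction pushed the drive too far
    if bored or drive.intensity > 0.6:
        return "sad"                    # left alone, or the same movement over and over
    return "happy/interested"           # the drive is being satisfied

social = Drive("social")
for t in range(5):
    someone_present = t >= 3            # a face appears after a while
    social.update(satisfied=someone_present)
    print(t, round(social.intensity, 2), expression(social))
```

The printout traces the loop the article describes: the drive's intensity climbs while Kismet is ignored until its face turns sad, then falls back towards "happy/interested" once someone shows up.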
So humans watching Kismet's expressions take them as signs of how it is
"feeling" and modify their behavior to restore it to a contented state--just as
they would with a child. Eventually, its happy and interested expressions will signify
that it is absorbing information at the optimum rate. So, Kismet's expressions will
regulate people's actions to let it learn at that pace.
The team is now working on a system that will allow Kismet to respond to the inflection in
people's voices, so that its emotional state can be changed by auditory as well as visual
stimuli. And soon, says Scassellati, they will introduce a vocal system so it can babble
like a baby. This will let Kismet attract people's attention even when they're not looking
at it, and let them know if it is happy or sad by cooing or crying. Eventually, with some
form of software capable of learning, the vocal system could play a part in helping Kismet
to develop language.
To some, Breazeal's approach might seem pointless: after all, what could bouncing a toy
inchworm teach a robot about the world? The same, however, could be argued of infants, and
yet it clearly works for them. This sort of exercise teaches infants more about themselves
than their environment, such as how to reach for an object.
It also gives them the opportunity to forge links between different sensory perceptions.
Children might learn, for example, to associate neural signals for bright colors with
those for movement, thereby realizing that the bright thing and the moving thing they are
seeing are one and the same. Breazeal and the team hope to see Cog learning in the same
way.
Such basic skills are essential for children because they pave the way for more complex
tasks later. To appreciate this, take the seemingly simple job of understanding what
someone is referring to when they point at an object. Only other apes and dolphins are
able to grasp that there's something "over there" worth looking at, and then
find the object of interest. This is one form of a skill, called joint attention, which
Cog needs to have because so much social interaction depends upon it, says Scassellati.
He hopes to give Cog this ability by following the idea that joint attention is built of
simpler skills. For infants to grasp the meaning of pointing, they first have to develop
the skill of knowing when someone is looking at them, and then learn to follow that
person's gaze and finger. Evidence for this composite nature comes from observing child
development.
While most three-month-olds have developed eye contact, it is not until nine months that
they follow another person's gaze, and 18 months that they follow someone's gaze out of
their field of view. And before they can single out the object being pointed at, it seems
that infants first go through a phase of following the person's gaze and finger, and
fixating on the first object their eyes see. Cog already has some of the basic skills
needed for joint attention, such as the ability to maintain eye contact. It can also learn
to point at an object. "Once we have enough of these rudimentary social pieces then
we can actually start to learn from people," says Scassellati.
And this is really the crucial point about the Cog project. It is not about wiring the robot up and turning it on, but rather about gradually increasing Cog's skills and watching as the richness and complexity of its behavior increases, hoping that one day "intelligence" will emerge.
In the next few months the team plans to carry out psychological tests to see if Kismet
can manipulate its carers as well as children manipulate theirs. A number of new heads are
also on the way, including one that will transfer Kismet's abilities to Cog, which should
make it appear more like a child than ever.
Already, the robots seem so human that there's a strong temptation to think of them as male or female, even though the Cog team has worked hard to keep its charges sexless. Still, if the researchers are ever to achieve their goal of making a robot that
interacts without making people feel uncomfortable, then perhaps they need to address this
issue. We humans, after all, are either one thing or the other.
BOX: No past, no future
OF THE PROBLEMS still to be faced by Rodney Brooks and his team at MIT, some of the
thorniest involve memory. A robot built using the traditional techniques of AI would have
a model of the world built into its software controls. So when it saw something red, the
robot would "know" it was red because its software would tell it so. Likewise,
if it needed to remember something--the position of an object, say--an instruction would
tell it to do so.
Cog has no such model or instructions because the aim is to let it choose what to remember.
So how will it make this decision? Or know that it's seeing red? And how will the concept
of redness be represented in the robot's memory? "In traditional AI you would never
think of this as being a problem," says Brooks's colleague, Brian Scassellati. But
for Cog these are major stumbling blocks that the team has yet to address.
Other things that we take for granted become real difficulties for Cog. Without a sense of
time, for example, Cog will not be able to order its thoughts or know the difference
between past and present. "There are so many problems," Scassellati sighs.
But then, a few years ago, social interaction seemed a colossal problem. And today, the
team's approach of breaking behaviors down into simpler steps is starting to pay
dividends. The question is whether storing and retrieving memories will fall to the same
approach.