- Detect prosody in human speech and show appropriate facial responses
We have demonstrated a robust technique for
recognizing affective intent in robot-directed speech. By analyzing
the prosody of a person's speech, Kismet can determine whether it is being
praised, prohibited, soothed, or given an attentional bid. The robot
can distinguish these affective intents from neutral robot-directed
speech. The output of the recognizer modulates the robot's emotional
models, inducing an appropriate affective state with a corresponding
facial expression (an expression of happiness when praised, sorrow
when prohibited, interest when alerted, and a relaxed expression for
soothing). In multi-lingual experiments with naive female subjects, we
found that the robot was able to robustly classify the four affective
intents. In addition, the subjects intuitively inferred from Kismet's
expressive feedback when their intent had been properly understood.
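To make the approach concrete, here is a minimal sketch of a prosody-based
intent classifier in Python. The feature set (pitch and energy statistics)
follows the work described above, but the thresholds and decision rules are
illustrative stand-ins, not the trained model used on the robot:

    import numpy as np

    def classify_affective_intent(f0, energy):
        """Label an utterance with one of Kismet's four affective intents,
        given its pitch contour (f0, in Hz, one value per frame; zero for
        unvoiced frames) and its frame-level energy (normalized to 0..1).
        The thresholds below are illustrative, not the trained model."""
        voiced = f0[f0 > 0]                  # keep voiced frames only
        pitch_mean = float(np.mean(voiced))
        pitch_var = float(np.var(voiced))
        energy_mean = float(np.mean(energy))

        if energy_mean > 0.7 and pitch_mean < 200:
            return "prohibition"             # low pitch, high intensity
        if pitch_var > 2000 and energy_mean > 0.5:
            return "praise"                  # exaggerated pitch excursions
        if pitch_mean > 300:
            return "attentional bid"         # high, rising pitch
        if pitch_var < 500 and energy_mean < 0.4:
            return "soothing"                # low, smooth, quiet contour
        return "neutral"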
Watch it in action (Quicktime movie):
In this video clip, Kismet correctly interprets four classes of affective
intent: praise, prohibition, attentional bids, and soothing. These were
taken from cross-lingual studies with naive subjects. The robot's
expressive feedback is readily interpreted by the subjects as well. [5.3MB]
For more information see:
Cynthia Breazeal and Lijin Aryananda. "Recognition of Affective
Communicative Intent in Robot-Directed Speech". Submitted to the
IEEE-RAS International Conference on Humanoid Robots 2000.
Cynthia Breazeal. "Sociable Machines: Expressive Social Exchange
Between Humans and Robots". Massachusetts Institute of Technology,
Department of Electrical Engineering and Computer Science, PhD Thesis,
May 2000. [SEE CHAPTER 7]
[6.5MB]
[20.7MB compressed]
- Expressive feedback through face, voice, and body posture
We have implemented expressive feedback in
multiple modalities on Kismet. The robot is able to express itself
through voice, facial expression, and body posture. We have evaluated
the readability of Kismet's expressions for anger, disgust, fear,
happiness, interest, sorrow, surprise, and some interesting blends
through numerous studies with naive human subjects.
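As a rough sketch of how a single affective state can drive all of these
channels at once, the snippet below blends basis facial postures by their
distance from the current point in a three-dimensional affect space
(arousal, valence, stance), in the spirit of the thesis; the anchor points
and the weighting scheme are simplified stand-ins for illustration:

    # Kismet generates expressions by interpolating basis facial postures
    # arranged in a three-dimensional affect space (arousal, valence,
    # stance).  These anchor points are illustrative, not the robot's
    # actual tables.
    BASIS_POSTURES = {
        "happiness": ( 0.5,  1.0,  0.5),
        "sorrow":    (-0.5, -1.0,  0.0),
        "anger":     ( 1.0, -1.0,  1.0),
        "fear":      ( 1.0, -0.5, -1.0),
        "interest":  ( 0.5,  0.5,  1.0),
        "calm":      ( 0.0,  0.0,  0.0),
    }

    def blend_weights(arousal, valence, stance):
        """Weight each basis posture by inverse distance from the current
        affective state; nearby postures dominate the blended face."""
        raw = {}
        for name, (a, v, s) in BASIS_POSTURES.items():
            d = ((arousal - a)**2 + (valence - v)**2
                 + (stance - s)**2) ** 0.5
            raw[name] = 1.0 / (d + 1e-6)
        total = sum(raw.values())
        return {name: w / total for name, w in raw.items()}

Because voice and body posture can be driven from the same affect-space
point, the three channels stay consistent with one another.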
Watch it in action (Quicktime movie):
In this video clip, Kismet says the phrase "Do you really think so?" with
varying emotional qualities. In order, the qualities are calm, anger,
disgust, fear, happiness, sadness, and interest. [3.6MB]
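Vocal affect of this kind is typically realized by scaling a few speech
synthesizer settings relative to the calm voice. The table below sketches
plausible directions (anger loud and fast, sadness low and slow); the
specific values are hypothetical, not Kismet's actual parameters, which
are documented in the thesis:

    # Hypothetical multipliers relative to the calm voice; Kismet's real
    # synthesizer settings are described in the thesis.
    VOICE_SETTINGS = {
        # quality:   (pitch baseline, pitch range, speech rate, loudness)
        "calm":      (1.00, 1.00, 1.00, 1.00),
        "anger":     (0.90, 1.40, 1.20, 1.40),
        "disgust":   (0.85, 0.90, 0.90, 1.00),
        "fear":      (1.30, 1.50, 1.40, 1.10),
        "happiness": (1.20, 1.40, 1.10, 1.10),
        "sadness":   (0.80, 0.70, 0.80, 0.80),
        "interest":  (1.10, 1.20, 1.05, 1.00),
    }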
For more information see:
Cynthia Breazeal and Brian Scassellati. "How to Build Robots
That Make Friends and Influence People". To appear in IROS99, Kyongju,
Korea, 1999.
Cynthia Breazeal. "Sociable Machines: Expressive Social Exchange
Between Humans and Robots". Massachusetts Institute of Technology,
Department of Electrical Engineering and Computer Science, PhD Thesis,
May 2000. [SEE CHAPTERS 11 AND 12]
[6.5MB]
[20.7MB compressed]
- Visual attention and gaze direction
We have implemented a visual attention system on
Cog and Kismet based on Jeremy Wolfe's model of human visual search.
We have tested the robustness of the attention system on these robots.
Because the robot's visual system is tuned to what humans find to be
inherently salient, its attention is often drawn to the same sorts of
stimuli that draw human attention. In studies with naive subjects, we
found that people intuitively use natural attention-grabbing cues
(motion, proximity, etc.) to quickly direct the robot's attention. The
subjects intuitively use the robot's gaze and smooth pursuit behavior
to determine when they have successfully directed the robot's
attention.
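A minimal sketch of the combination step appears below: bottom-up feature
maps are weighted, summed, and suppressed by a habituation map, and the
robot directs its gaze at the peak. The map names and weights here are
illustrative; in the actual system the weights are modulated by the
robot's motivational state (for example, boosting face saliency when the
robot is seeking social stimuli):

    import numpy as np

    def attention_target(color, motion, face, habituation, weights):
        """Combine bottom-up feature maps (2-D arrays on a common grid)
        into one saliency map and return the most salient location.
        The weights are illustrative; the real system adjusts them
        according to the robot's motivational state."""
        saliency = (weights["color"] * color
                    + weights["motion"] * motion
                    + weights["face"] * face
                    - habituation)           # suppress stale targets
        y, x = np.unravel_index(np.argmax(saliency), saliency.shape)
        return (x, y), saliency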
Watch it in action (Quicktime movies):
In this video clip, Kismet is searching for a toy. Its facial
expression and eye movements make it readily apparent to an observer
when the robot has discovered the colorful block on the stool. The
attention system runs continuously, enabling the robot to respond
appropriately to unexpected stimuli (such as the person entering from
the right-hand side of the frame to take away the toy). Notice how
Kismet appears a bit crestfallen when its toy is removed. [3.9MB]
These three movies show the visual attention system in action:
This clip illustrates the color saliency process. The left frame
is the video signal; the middle frame shows the raw saliency values
due to color; the right frame shows how the colorful block is
particularly salient. The bright region in the center is the
habituation influence. [6.8MB]
This clip illustrates the motion saliency process. [5.9MB]
This clip illustrates the face saliency process (center) and the
habituation process (right). [14.3MB]
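The habituation process shown in the last clip can be sketched as a
decaying map that grows under the current gaze point, so a fixated target
gradually loses saliency and the robot moves on. The Gaussian footprint
and the constants below are illustrative choices, not the robot's values:

    import numpy as np

    def update_habituation(habituation, gaze_xy, decay=0.95,
                           gain=0.1, sigma=20.0):
        """One update step: the whole map decays toward zero while a
        Gaussian bump accumulates at the fixated location (gaze_xy,
        in pixels).  Subtracting this map from the combined saliency
        map makes the robot lose interest in a fixated target."""
        h, w = habituation.shape
        ys, xs = np.mgrid[0:h, 0:w]
        gx, gy = gaze_xy
        bump = np.exp(-((xs - gx)**2 + (ys - gy)**2) / (2 * sigma**2))
        return decay * habituation + gain * bump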
For more information see:
Cynthia Breazeal and Brian Scassellati. "A Context-dependent
Attention System for a Social Robot". In Proceedings of the Sixteenth
International Joint Conference on Artificial Intelligence (IJCAI99),
Stockholm, Sweden, pp.1146-1151, 1999.
Cynthia Breazeal. "Sociable Machines: Expressive Social Exchange
Between Humans and Robots". Massachusetts Institute of Technology,
Department of Electrical Engineering and Computer Science, PhD Thesis,
May 2000. [SEE CHAPTERS 6 AND 13]
[6.5MB]
[20.7MB compressed]
Cynthia Breazeal, Aaron Edsinger, Paul Fitzpatrick and Brian
Scassellati. "Social Constraints on Animate Vision". submitted to the
IEEE-RAS International Conference on Humanoid Robots 2000.