Multimodal Interaction With Speech & Gestures


Principal Investigator:

David Demirdjian 


The main objective of this work is to design a Perceptual User Interface that provides:

· the pose of the user (location and orientation of the arms and head)

· the detection of gestures


Pose and gesture information are then combined with speech recognition to drive an application, such as navigating and interacting with a virtual world, playing a video game, or interacting with an avatar.



This work consists of segmenting and tracking, in real time, the gestures of a user observed by a stereo camera. More precisely, a 3D model of a person is compared to the set of 3D points reconstructed by stereo, and the model is updated using a technique similar to ICP (Iterative Closest Point). The technique is fast (15 Hz on a Pentium IV) and robust to changes in illumination.
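
This page does not give implementation details, so the following is only a rough illustration of the basic ICP idea mentioned above, not the tracker itself. It is a minimal sketch in Python with NumPy that aligns a single rigid set of model points to a set of observed 3D points. The system described here tracks arms and head, which presumably requires more than one rigid part; the sketch does not attempt that. The function names (best_rigid_transform, icp), the brute-force nearest-neighbor search, and the parameters iterations and tol are assumptions made for illustration.

```python
import numpy as np


def best_rigid_transform(src, dst):
    """Least-squares rotation R and translation t mapping src onto dst.

    src and dst are (N, 3) arrays of corresponding points (Kabsch method).
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a rotation, not a reflection.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def icp(model, observed, iterations=30, tol=1e-9):
    """Rigidly align model points (N, 3) to observed points (M, 3).

    Returns the accumulated rotation, translation, and the moved model points.
    """
    R_total = np.eye(3)
    t_total = np.zeros(3)
    current = model.copy()
    prev_err = np.inf
    for _ in range(iterations):
        # Step 1: for each model point, find the closest observed point
        # (brute force; hypothetical choice, fine for small point sets).
        d2 = ((current[:, None, :] - observed[None, :, :]) ** 2).sum(axis=2)
        nearest = observed[d2.argmin(axis=1)]
        err = np.sqrt(d2.min(axis=1)).mean()
        # Step 2: solve for the rigid motion that best fits those matches.
        R, t = best_rigid_transform(current, nearest)
        # Step 3: move the model and accumulate the overall transform.
        current = current @ R.T + t
        R_total = R @ R_total
        t_total = R @ t_total + t
        # Stop once the mean matching distance no longer changes.
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_total, t_total, current
```

Calling icp(model, observed) on a model that is displaced only slightly from the observed points should recover the displacement; like any ICP variant, the sketch depends on a good initial guess, which in a tracking setting is typically the pose from the previous frame.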



