Modelling Nonlinear Shape-and-Texture Appearance Manifolds

 
Authors
 
C. Mario Christoudias (cmch at csail.mit.edu)
Trevor Darrell (trevor at csail.mit.edu)
 
Abstract
 

Statistical shape-and-texture appearance models (also known as deformable models) [1, 4] use image morphing to define a rich, compact representation of object appearance. Deformable models are useful in a variety of applications, including object recognition, tracking, and segmentation [4, 5, 6]. These techniques, however, have been limited to object classes that exhibit relatively simple appearance variation, and so they cannot be applied to many interesting object classes such as hands or mouths. The appearance manifold of such object classes is highly nonlinear, and their appearance cannot be linearized into shape and texture by applying a global morphing operator. To see this, consider morphing between an open and a closed mouth: the intermediate images of the morph would contain holes and folds caused by stretching and folding the mouth region. Instead, these objects have an appearance manifold with a varying topology: each region of the space corresponds to images in which the same parts of the mouth or hand are visible, and holes in the manifold correspond to inadmissible appearance, such as that produced by interpolating between an open and a closed mouth.

In addition to having a complex appearance manifold, these object classes also have shape vectors whose dimensionality varies with the parts of the object that are visible in the image. For example, the image of an open mouth contains shape features for the teeth that are absent from the image of a closed mouth. In this work we propose two complementary, nonlinear deformable models that can model object classes whose appearance manifold has a varying topology and whose shape has a varying dimensionality. To the authors' knowledge, this is the first work to describe a nonlinear model of shape and appearance. We demonstrate our models using mouth images from video sequences of people speaking and compare them to a baseline linear model. An extensive discussion of this work can be found in the publication below. Additional results that were omitted from the publication due to space constraints are also available below.
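The varying dimensionality of the shape vectors can be illustrated with a small sketch. This is not the model proposed in the publication; it only shows why examples with different visible parts cannot be stacked into one fixed-length matrix, as a single global linear model would require. The part names (`lips`, `teeth`), the point coordinates, and the helper functions are all hypothetical and chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical annotations: each example maps a visible part to its (x, y) points.
examples = [
    {"lips": [(0.0, 0.0), (1.0, 0.2), (2.0, 0.0)]},            # closed mouth
    {"lips": [(0.0, 0.0), (1.0, 0.6), (2.0, 0.0)],
     "teeth": [(0.5, 0.2), (1.5, 0.2)]},                        # open mouth
    {"lips": [(0.1, 0.0), (1.0, 0.3), (1.9, 0.0)]},            # closed mouth
]


def shape_vector(example):
    """Flatten the visible parts, in sorted part order, into one coordinate list."""
    return [c for part in sorted(example) for point in example[part] for c in point]


def group_by_visible_parts(examples):
    """Group shape vectors by their set of visible parts.

    Within a group every vector has the same length; across groups the
    lengths differ, which is the varying dimensionality described above.
    """
    groups = defaultdict(list)
    for ex in examples:
        groups[tuple(sorted(ex))].append(shape_vector(ex))
    return dict(groups)


groups = group_by_visible_parts(examples)
for parts, vectors in groups.items():
    print(parts, "dimension:", len(vectors[0]), "examples:", len(vectors))
```

In this toy data the closed-mouth examples yield 6-dimensional shape vectors and the open-mouth example a 10-dimensional one, so they cannot be stacked into a single matrix.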

Additional Results

 
Publication
 
  • C. Mario Christoudias and Trevor Darrell. On Modelling Nonlinear Shape-and-Texture Appearance Manifolds. IEEE Conference on Computer Vision and Pattern Recognition, June 2005, San Diego, California.
 
References
 
  1. T. F. Cootes, G. J. Edwards, and C. J. Taylor. Active appearance models. Lecture Notes in Computer Science, 1407:484–98, 1998.


  2. G. Edwards, C. Taylor, and T. Cootes. Interpreting face images using active appearance models. In 3rd International Conference on Automatic Face and Gesture Recognition, pages 300–305, 1998.


  3. T. J. Hazen, K. Saenko, C. H. La, and J. Glass. A segment-based audio-visual speech recognizer: Data collection, development, and initial experiments. In Proc. ICMI, 2005.


  4. Michael J. Jones and Tomaso Poggio. Multidimensional morphable models. In ICCV, pages 683–688, 1998.


  5. Stan Sclaroff and John Isidoro. Active blobs. In ICCV, Mumbai, India, 1998.


  6. M. B. Stegmann. Analysis and segmentation of face images using point annotations and linear subspace techniques. Technical report, DTU, 2002.
 
Vision Interfaces Group - May 2005