The Video Mesh: A Data Structure for Image-based Three-dimensional Video Editing
The video mesh data structure represents the structural information in
an input video as a set of deforming, texture-mapped triangles augmented
with mattes. The mesh has a topology that resembles a "paper cutout."
This representation enables a number of special effects applications
such as depth of field manipulation, object insertion, and change of 3D
viewpoint.
Paper: PDF (ICCP 2011)
Supplemental material: PDF
Video (high bitrate): MP4 (67MB, 640x480p24, contains audio)
Video (low bitrate): MP4 (29MB, 640x480p24, contains audio)
Anaglyphic (red/cyan) results: AVI (Lagarith lossless codec, 246 MB, 640x480p24, no audio)
Stills of anaglyphic results: PNGs (best viewed full screen on a 24" (61 cm) monitor at a distance of 30" (76 cm))
Slides from ICCP 2011: PPTX (PowerPoint 2010, WMV videos embedded, 81 MB)
Abstract
This paper introduces the video mesh, a data structure for
representing video as 2.5D "paper cutouts." The video mesh allows
interactive editing of moving objects and modeling of depth, which
enables 3D effects and post-exposure camera control. The video mesh
sparsely encodes optical flow as well as depth, and handles occlusion
using local layering and alpha mattes. Motion is described by a sparse
set of points tracked over time. Each point also stores a depth value.
The video mesh is a triangulation over this point set and per-pixel
information is obtained by interpolation. The user rotoscopes occluding
contours and we introduce an algorithm to cut the video mesh along them.
Object boundaries are refined with per-pixel alpha values. Because the
video mesh is at its core a set of texture-mapped triangles, we leverage
graphics hardware to enable interactive editing and rendering of a
variety of effects. We demonstrate the effectiveness of our
representation with special effects such as 3D viewpoint changes, object
insertion, depth-of-field manipulation, and 2D-to-3D video conversion.
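
Illustrative sketch
The abstract describes a sparse set of tracked points that each store a
depth value, a triangulation over those points, and per-pixel information
obtained by interpolation. The Python sketch below shows one way such a
structure might look. It is not the implementation from the paper: the
names TrackedVertex, VideoMeshSketch, and depth_at are hypothetical, the
interpolation is assumed to be plain linear barycentric interpolation in
image space, and occlusion handling, mesh cutting, and alpha mattes are
left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# All names here are hypothetical; this is a sketch, not the paper's code.

@dataclass
class TrackedVertex:
    """A sparse point tracked over time, with a depth value per frame."""
    samples: Dict[int, Tuple[float, float, float]]  # frame -> (x, y, depth)


@dataclass
class VideoMeshSketch:
    """A triangulation over tracked points (no cuts, layers, or mattes)."""
    vertices: List[TrackedVertex]
    triangles: List[Tuple[int, int, int]]  # indices into `vertices`

    def _weights(self, tri, frame, px, py):
        """Barycentric weights of pixel (px, py) in `tri` at `frame`."""
        pts = [self.vertices[i].samples[frame] for i in tri]
        (x0, y0, _), (x1, y1, _), (x2, y2, _) = pts
        det = (y1 - y2) * (x0 - x2) + (x2 - x1) * (y0 - y2)
        if det == 0:  # degenerate (zero-area) triangle
            return None
        w0 = ((y1 - y2) * (px - x2) + (x2 - x1) * (py - y2)) / det
        w1 = ((y2 - y0) * (px - x2) + (x0 - x2) * (py - y2)) / det
        return (w0, w1, 1.0 - w0 - w1)

    def depth_at(self, frame: int, px: float, py: float) -> Optional[float]:
        """Interpolate depth at a pixel from the triangle that contains it.

        Assumption: linear interpolation of the three vertex depths.
        Returns None if no triangle covers the pixel.
        """
        for tri in self.triangles:
            w = self._weights(tri, frame, px, py)
            if w is not None and min(w) >= -1e-9:
                depths = [self.vertices[i].samples[frame][2] for i in tri]
                return sum(wi * di for wi, di in zip(w, depths))
        return None
```

Because each vertex stores its position per frame, the same query follows
the triangle as it deforms over time; in this simplified form, motion and
depth are both read from the same sparse point set.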