Mixture of Manhattan Frames
Man-made objects and buildings exhibit a clear structure in the form of orthogonal and parallel planes. This observation, commonly referred to as the Manhattan-world (MW) model, has been widely exploited in computer vision and robotics. At both larger and smaller scales, however, such as the scale of a city, an indoor scene, or an individual object, a more flexible model is merited. Here, we propose a novel probabilistic model that describes scenes as mixtures of Manhattan Frames (MFs): sets of orthogonal and parallel planes. By exploiting the geometry of both the orthogonality constraints and the unit sphere, our approach describes man-made structures in a flexible way. We propose an inference method that is a hybrid of Gibbs sampling and gradient-based optimization of a robust cost function over the SO(3) manifold. An MF-merging mechanism allows us to infer the model order. We show the versatility of our Mixture-of-Manhattan-Frames (MMF) model by describing complex scenes from ASUS Xtion PRO depth images and aerial-LiDAR measurements of an urban center. Additionally, we demonstrate that the model lends itself to depth focal-length calibration of RGB-D cameras as well as to plane segmentation.
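To make the core idea concrete: a single Manhattan Frame is a rotation R in SO(3), whose columns and their negations give six signed axis directions, and each surface normal is explained by the axis it is most aligned with. The sketch below illustrates this hard-assignment view only; it is not the paper's full probabilistic model, which places distributions around the axes and infers the rotations by sampling and manifold optimization. The function names are ours, chosen for illustration.

```python
import numpy as np

def mf_axes(R):
    """Return the 6 signed axis directions of the MF given by rotation R."""
    return np.concatenate([R.T, -R.T], axis=0)  # shape (6, 3)

def assign_normals(normals, R):
    """Assign each unit surface normal to the closest of the 6 MF axes."""
    dots = normals @ mf_axes(R).T  # cosine similarity, shape (N, 6)
    return np.argmax(dots, axis=1)

# Example: identity MF; one normal near +z, one near -x.
R = np.eye(3)
normals = np.array([[0.0, 0.1, 0.99],
                    [-0.98, 0.1, 0.1]])
normals /= np.linalg.norm(normals, axis=1, keepdims=True)
print(assign_normals(normals, R))  # → [2 3]
```

A mixture of MFs then simply repeats this for several rotations R_1, ..., R_K and lets each normal pick the best axis over all frames.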

People Involved: Julian Straub, Guy Rosman, Oren Freifeld, John J. Leonard, John W. Fisher III

update: the MMF code is available online under an academic license; download it here.

update: we provide the inferred MMFs for all of the NYU V2 depth dataset here.

update: the CVPR talk is online.

Mixture of Manhattan Frames extracted from RGB-D scenes
Several Scenes with Extracted MMF

Inference results of the MMF model for different kinds of indoor scenes. The first row shows the RGB images of the scenes; the second row shows the assignment of normals to MFs (MF 1 red, MF 2 green, MF 3 blue); the third row depicts a plane segmentation for each scene. The last row shows the negative log-likelihood of each normal under the inferred model, with probability decreasing from black through red to yellow. Black areas in the images designate pixels with no depth information.
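The per-normal score in the last row can be sketched as follows. Assuming, purely for illustration, an isotropic Gaussian in the tangent plane at each MF axis (the paper's model and its robust cost are more elaborate, and `sigma` here is a made-up parameter), the negative log-likelihood of a normal reduces to the squared angular distance to its closest axis:

```python
import numpy as np

def normal_nll(normals, R, sigma=0.2):
    """Score each unit normal by its angular distance to the nearest MF axis.

    Sketch only: isotropic tangent-plane Gaussian, NLL up to a constant.
    """
    axes = np.concatenate([R.T, -R.T], axis=0)   # 6 signed MF axes
    cos = np.clip(normals @ axes.T, -1.0, 1.0)
    theta = np.arccos(cos.max(axis=1))           # angle to closest axis
    return 0.5 * (theta / sigma) ** 2

R = np.eye(3)
normals = np.array([[0.0, 0.0, 1.0],                    # exactly on +z
                    [np.sin(0.3), 0.0, np.cos(0.3)]])   # 0.3 rad off +z
print(normal_nll(normals, R))  # ≈ [0., 1.125]
```

Mapping these scores through a black-red-yellow colormap yields a visualization in the spirit of the last row: well-explained normals stay dark, poorly explained ones glow.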

Mixture of Manhattan Frames extracted from Cambridge LiDAR data
MMF of Cambridge LiDAR data

MMF extracted from the LiDAR-scanned urban scene. There is a clear separation into three MFs, colored red, green, and blue, with orientations indicated by the axes in the top-left corner. Normals associated with the upward-pointing axes of the MFs are colored gray to reveal the composition of the scene more clearly.

Mixture of Manhattan Frames Code

We are releasing the MMF inference code under an academic license. You can download it from here.

Mixture of Manhattan Frames extracted from NYU depth dataset V2

We ran the MMF inference on the full NYU depth dataset V2, consisting of N=1449 RGB-D frames, and provide the results as a dataset, which can be found here.

Download the paper and supplemental material and watch the CVPR talk. Code can be found here.


[2014] Julian Straub, Guy Rosman, Oren Freifeld, John J. Leonard, John W. Fisher III, "A Mixture of Manhattan Frames: Beyond the Manhattan World", In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2014.