Current Research Projects
|Small-Variance Nonparametric Clustering on the Hypersphere|
Structural regularities in man-made environments are reflected in the distribution of their surface normals. Describing these surface normal distributions is important in many computer vision applications, such as scene understanding, plane segmentation, and regularization of 3D reconstructions. Based on the small-variance limit of Bayesian nonparametric von Mises-Fisher (vMF) mixture distributions, we propose two new flexible and efficient k-means-like clustering algorithms for directional data such as surface normals. The first, DP-vMF-means, is a batch clustering algorithm derived from the Dirichlet process (DP) vMF mixture. Recognizing the sequential nature of data collection in many applications, we extend this algorithm to DDP-vMF-means, which infers temporally evolving cluster structure from streaming data. Both algorithms naturally respect the geometry of directional data, which lies on the unit sphere. We demonstrate their performance on synthetic directional data and real 3D surface normals from RGB-D sensors. While our experiments focus on 3D data, both algorithms generalize to high-dimensional directional data such as protein backbone configurations and semantic word vectors.
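As a rough illustration of what a k-means-like algorithm on the sphere can look like, the following Python sketch assigns each unit vector to the mean direction with the highest cosine similarity and opens a new cluster when that similarity falls below a threshold. The function name `sphere_means_with_threshold`, the parameter `min_cos` (assumed positive), and the initialization from the first point are assumptions made for this example; it is not the DP-vMF-means implementation itself.

```python
import math

def normalize(v):
    """Scale a vector to unit length so that it lies on the sphere."""
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def sphere_means_with_threshold(points, min_cos, n_iters=10):
    """Hypothetical k-means-like clustering of directions.

    A point joins the mean direction with the largest cosine similarity.
    If that similarity is below `min_cos` (assumed > 0), the point starts
    a new cluster instead.
    """
    points = [normalize(p) for p in points]
    means = [points[0]]  # assumption: initialize with the first point
    labels = [0] * len(points)
    for _ in range(n_iters):
        # Assignment step: compare each point with every mean direction.
        for i, p in enumerate(points):
            sims = [dot(p, m) for m in means]
            best = max(range(len(means)), key=sims.__getitem__)
            if sims[best] < min_cos:
                means.append(p)
                best = len(means) - 1
            labels[i] = best
        # Update step: mean direction = normalized sum of the members.
        # Empty clusters are dropped and labels are renumbered.
        updated, remap = [], {}
        for k in range(len(means)):
            members = [p for p, z in zip(points, labels) if z == k]
            if members:
                remap[k] = len(updated)
                updated.append(normalize([sum(c) for c in zip(*members)]))
        labels = [remap[z] for z in labels]
        means = updated
    return labels, means

# Usage: three normals near +z and three near +x.
normals = [(0, 0, 1), (0.1, 0, 1), (0, 0.1, 1),
           (1, 0, 0), (1, 0.1, 0), (1, 0, 0.1)]
labels, means = sphere_means_with_threshold(normals, min_cos=0.8)
```

The threshold plays the role that a fixed cluster count plays in ordinary k-means: a smaller `min_cos` tolerates wider clusters and therefore yields fewer of them.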
|Semantically-Aware Aerial Reconstruction from Multi-Modal Data|
We consider a methodology for integrating multiple sensors along with semantic information to enhance scene representations. We propose a probabilistic generative model for inferring semantically-informed aerial reconstructions from multi-modal data within a consistent mathematical framework. The approach, called Semantically Aware Aerial Reconstruction (SAAR), not only exploits inferred scene geometry, appearance, and semantic observations to obtain a meaningful categorization of the data, but also extends previously proposed methods by imposing structure on the prior over geometry, appearance, and semantic labels. This leads to more accurate reconstructions and the ability to fill in missing contextual labels via joint sensor and semantic information. We introduce a new multi-modal synthetic dataset in order to provide quantitative performance analysis. Additionally, we apply the model to real-world data and exploit OpenStreetMap as a source of semantic observations. We show quantitative improvements in reconstruction accuracy of large-scale urban scenes from the combination of LiDAR, aerial photography, and semantic data. Furthermore, we demonstrate the model's ability to fill in for missing sensed data, leading to more interpretable reconstructions.
|A Fast Method for Inferring High-Quality Simply-Connected Superpixels|
Superpixel segmentation is a key step in many image processing and vision tasks. Our recently proposed connectivity-constrained probabilistic model yields high-quality superpixels. Seemingly, however, connectivity and parallelized inference cannot coexist, so the original implementation is serial and hence slow. The contributions of this work are as follows. First, we show that effective parallelization is in fact possible. This leads to a fast GPU implementation that scales gracefully with both the number of pixels and the number of superpixels. Second, we show that the superpixels are improved by replacing the fixed and restricted spatial covariances of the original model with unrestricted Bayesian estimates. Quantitative evaluation on public benchmarks shows that the proposed method outperforms the state of the art.
|Efficient Diffeomorphisms and Their Applications|
We propose novel finite-dimensional spaces of well-behaved R^n → R^n transformations. These transformations are obtained by (fast and highly accurate) integration of continuous piecewise-affine velocity fields. The proposed method is simple yet highly expressive, effortlessly handles optional constraints such as volume preservation and/or boundary conditions, and supports convenient modeling choices such as smoothing priors and coarse-to-fine analysis. Importantly, the proposed approach, partly due to its rapid likelihood evaluations and partly due to its other properties, facilitates tractable inference over rich transformation spaces, including with methods based on Markov chain Monte Carlo (MCMC). Its applications include, but are not limited to: monotonic regression (more generally, optimization over monotonic functions); modeling cumulative distribution functions or histograms; time warping; image warping; image registration; real-time diffeomorphic image editing; and data augmentation for image classifiers. Finally, we provide GPU-accelerated code.
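To make the idea of integrating a continuous piecewise-affine velocity field concrete, here is a minimal one-dimensional Python sketch on the interval [0, 1]. The velocity is defined by its values at equally spaced knots and vanishes at both ends, so the resulting map fixes the endpoints. The names `cpa_velocity` and `warp`, the knot parameterization, and the use of plain forward-Euler steps are assumptions for illustration; the sketch does not reproduce the fast, highly accurate integration mentioned above.

```python
def cpa_velocity(x, knot_values):
    """Continuous piecewise-affine velocity on [0, 1].

    `knot_values` holds the velocity at equally spaced knots; the velocity
    is interpolated linearly in between (hypothetical parameterization).
    """
    n_cells = len(knot_values) - 1
    x = min(max(x, 0.0), 1.0)
    cell = min(int(x * n_cells), n_cells - 1)
    t = x * n_cells - cell  # position inside the cell, in [0, 1]
    return (1.0 - t) * knot_values[cell] + t * knot_values[cell + 1]

def warp(x, knot_values, n_steps=1000):
    """Follow dx/dt = v(x) from t = 0 to t = 1 with forward-Euler steps."""
    h = 1.0 / n_steps
    for _ in range(n_steps):
        x += h * cpa_velocity(x, knot_values)
    return x

# Usage: zero velocity at both ends keeps 0 and 1 fixed.
knots = [0.0, 0.5, -0.3, 0.0]
warped = [warp(i / 20, knots) for i in range(21)]
```

With a small enough step, each Euler update is itself an increasing function of x, so the composed map is monotonic. That is why such warps are a natural fit for the monotonic regression and time-warping applications listed above.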
|Mixture of Manhattan Frames|
Man-made objects and buildings exhibit a clear structure in the form of orthogonal and parallel planes. This observation, commonly referred to as the Manhattan-world (MW) model, has been widely exploited in computer vision and robotics. At both larger and smaller scales (the scale of a city, or that of indoor scenes and smaller objects), a more flexible model is merited. Here, we propose a novel probabilistic model that describes scenes as mixtures of Manhattan Frames (MF), which are sets of orthogonal and parallel planes. By exploiting the geometry of both orthogonality constraints and the unit sphere, our approach allows us to describe man-made structures in a flexible way. We propose an inference scheme that is a hybrid of Gibbs sampling and gradient-based optimization of a robust cost function over the SO(3) manifold. An MF merging mechanism allows us to infer the model order. We show the versatility of our Mixture-of-Manhattan-Frames (MMF) model by describing complex scenes from ASUS Xtion PRO depth images and aerial-LiDAR measurements of an urban center. Additionally, we demonstrate that the model lends itself to depth focal-length calibration of RGB-D cameras as well as to plane segmentation.
|A Dirichlet Process Mixture Model for Spherical Data|
Directional data, naturally represented as points on the unit sphere, appear in many applications. However, unlike the case of Euclidean data, flexible mixture models on the sphere that can capture correlations, handle an unknown number of components and extend readily to high-dimensional data have yet to be suggested. For this purpose we propose a Dirichlet process mixture model of Gaussian distributions in distinct tangent spaces (DP-TGMM) to the sphere. Importantly, the formulation of the proposed model allows the extension of recent advances in efficient inference for Bayesian nonparametric models to the spherical domain. Experiments on synthetic data as well as real-world 3D surface normal and 20-dimensional semantic word vector data confirm the expressiveness and applicability of the DP-TGMM.
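One standard way to move between the sphere and a tangent space, where an ordinary Gaussian can be defined, is the Riemannian logarithm map and its inverse, the exponential map. The Python sketch below shows both for unit vectors. The function names `log_map` and `exp_map` are chosen for this example, and whether the DP-TGMM uses exactly these maps is an assumption here.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def log_map(p, x):
    """Map the unit vector x into the tangent space at the unit vector p.

    The result is orthogonal to p and its length equals the angle between
    p and x. The map is undefined when x is the antipode of p.
    """
    c = max(-1.0, min(1.0, dot(p, x)))
    theta = math.acos(c)
    if theta < 1e-12:
        return [0.0] * len(p)
    ortho = [xi - c * pi for xi, pi in zip(x, p)]  # part of x orthogonal to p
    norm = math.sqrt(dot(ortho, ortho))
    if norm < 1e-12:
        raise ValueError("log map is undefined at the antipode of p")
    return [theta * o / norm for o in ortho]

def exp_map(p, v):
    """Map a tangent vector v at p back onto the sphere."""
    theta = math.sqrt(dot(v, v))
    if theta < 1e-12:
        return list(p)
    return [math.cos(theta) * pi + math.sin(theta) * vi / theta
            for pi, vi in zip(p, v)]

# Usage: a point a quarter circle away from the north pole.
north = [0.0, 0.0, 1.0]
tangent = log_map(north, [1.0, 0.0, 0.0])
back = exp_map(north, tangent)
```

Because tangent vectors live in a flat space, a full covariance matrix can be estimated from them, which is how a tangent-space Gaussian can capture correlations that an isotropic distribution on the sphere cannot.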
Past Research Projects
|Aerial Reconstructions via Probabilistic Data Fusion|
We propose an integrated probabilistic model for multi-modal fusion of aerial imagery, LiDAR data, and (optional) GPS measurements. The model allows for analysis and dense reconstruction (in terms of both geometry and appearance) of large 3D scenes. An advantage of the approach is that it explicitly models uncertainty and allows for missing data. As compared with image-based methods, dense reconstructions of complex urban scenes are feasible with fewer observations. Moreover, the proposed model allows one to estimate absolute scale and orientation, and reason about other aspects of the scene, e.g., detection of moving objects. As formulated, the model lends itself to massively-parallel computing. We exploit this in an efficient inference scheme that utilizes both general purpose and domain-specific hardware components. We demonstrate results on large-scale reconstruction of urban terrain from LiDAR and aerial photography data.
In this paper, we develop a generative probabilistic model for temporally consistent superpixels in video sequences. Unlike supervoxel methods, the same temporal superpixel in different frames tracks the same part of an underlying object. Our method explicitly models the flow between frames with a bilateral Gaussian process and uses this information to propagate superpixels in an online fashion. We present four new metrics to measure performance of a temporal superpixel representation and find that our method outperforms supervoxel methods.
|Bayesian Nonparametric Modeling of Driver Behavior|
Modern vehicles are equipped with increasingly complex sensors. These sensors generate large volumes of data that provide opportunities for modeling and analysis. Here, we are interested in exploiting this data to learn aspects of behaviors and the road network associated with individual drivers. Our dataset is collected on a standard vehicle used to commute to work and for personal trips. A Hidden Markov Model (HMM) trained on the GPS position and orientation data is utilized to compress the large amount of position information into a small amount of road segment states. Each state has a set of observations, i.e. car signals, associated with it that are quantized and modeled as draws from a Hierarchical Dirichlet Process (HDP). The inference for the topic distributions is carried out using MCMC split-merge sampling and online variational inference algorithms. The topic distributions over joint quantized car signals characterize the driving situation in the respective road state. In a novel manner, we demonstrate how the sparsity of the personal road network of a driver in conjunction with a hierarchical topic model allows data driven predictions about destinations as well as likely road conditions.
Optimal sensor selection is combinatorially complex and hence intractable for large-scale problems. Under mild conditions, greedy heuristics have been proven to achieve performance within a constant factor of the optimal. Mutual information, a commonly used reward in information theory, can lead to "myopic" selection since it makes no use of the costs assigned to measurements. In addition, the particular choice of the visit walk greatly affects the outcome. In this project, we will examine conditions under which cost-penalized mutual information may achieve guarantees similar to those of mutual information. Lastly, we will explore ways to make informed choices of the visit walk, examine whether locally optimizing exchange algorithms can improve the results of greedy approaches, and work on finding more efficient ways to compute information rewards.
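A small Python sketch can show how a cost penalty changes greedy selection. It assumes a deliberately simple model that is not taken from the project: a scalar Gaussian quantity with variance `prior_var` is observed by independent sensors with noise variances `noise_vars`, for which the mutual information has the closed form 0.5 * log(1 + prior_var * sum(1 / noise_var)). The function names, the linear penalty `cost_weight * cost`, and the stopping rule are all assumptions for illustration.

```python
import math

def mutual_information(selected, noise_vars, prior_var):
    """MI between a scalar Gaussian and the selected noisy observations."""
    precision_gain = sum(1.0 / noise_vars[i] for i in selected)
    return 0.5 * math.log(1.0 + prior_var * precision_gain)

def greedy_select(noise_vars, costs, prior_var=1.0, cost_weight=1.0,
                  max_sensors=None):
    """Greedily add the sensor with the best cost-penalized MI gain.

    Stops when no remaining sensor has a positive penalized gain or when
    `max_sensors` sensors have been chosen (hypothetical stopping rule).
    """
    n = len(noise_vars)
    budget = n if max_sensors is None else max_sensors
    selected = []
    while len(selected) < budget:
        base = mutual_information(selected, noise_vars, prior_var)
        best, best_score = None, 0.0
        for i in range(n):
            if i in selected:
                continue
            gain = mutual_information(selected + [i], noise_vars, prior_var) - base
            score = gain - cost_weight * costs[i]
            if score > best_score:
                best, best_score = i, score
        if best is None:
            break
        selected.append(best)
    return selected

# Usage: sensor 0 is the most accurate but also the most expensive.
noise_vars = [0.1, 1.0, 0.2]
costs = [5.0, 0.1, 0.2]
with_costs = greedy_select(noise_vars, costs, cost_weight=1.0)
without_costs = greedy_select(noise_vars, costs, cost_weight=0.0, max_sensors=2)
```

Setting `cost_weight` to zero recovers plain mutual-information greedy selection, which picks the expensive sensor first. With the penalty switched on, the selection avoids it.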
|Bayesian Structure Inference and Interaction Analysis|
We investigate models and algorithms for Bayesian inference of time-varying dependencies (interactions) among multiple time-series from noisy observations. Analyzing such dependencies is important in many domains, such as social networks, finance, biology and object interaction analysis. We cast the problem of inference over dependence structures as the problem of learning the structure of a dynamic Bayesian network (DBN). This problem is inherently hard. Interactions are rarely observed directly and need to be inferred from noisy observations of objects’ properties. At the same time, the number of possible dependence structures is super-exponential in the number of time-series. To deal with uncertainty, we adopt a fully-Bayesian approach, in which our objective is to characterize a full posterior distribution over dependence structures. The probability of any structural event, such as “Is there an edge from A to B?”, can be easily computed from the full posterior. We use a modular prior and a bound on the number of parent sets per object to reduce the number of structures (at a single time point) from super-exponential to polynomial in the number of time-series.
|MCMC Sampling over Shapes|
We present a method for sampling from the posterior distribution of implicitly defined segmentations conditioned on the observed image. Segmentation is often formulated as an energy minimization or statistical inference problem in which either the optimal or most probable configuration is the goal. Exponentiating the negative energy functional provides a Bayesian interpretation in which the solutions are equivalent. Sampling methods enable evaluation of distribution properties that characterize the solution space via the computation of marginal event probabilities. We develop a Metropolis-Hastings sampling algorithm over level-sets which improves upon previous methods by allowing for topological changes (if desired) while simultaneously decreasing computational times by orders of magnitude.
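The link between an energy functional, a posterior proportional to exp(-energy), and marginal event probabilities can be sketched on a toy problem. The Python example below runs Metropolis-Hastings over binary labelings of a short 1D signal using single-site flips. It is a deliberately simplified stand-in and not the level-set sampler described above: the energy (a quadratic data term plus a penalty per label change), the flip proposal, and all names and parameter values are assumptions for illustration.

```python
import math
import random

def energy(labels, signal, data_weight, smooth_weight):
    """Toy segmentation energy: data fidelity plus a boundary penalty."""
    data = sum(data_weight * (y - z) ** 2 for y, z in zip(signal, labels))
    smooth = sum(smooth_weight for a, b in zip(labels, labels[1:]) if a != b)
    return data + smooth

def mh_marginals(signal, n_sweeps=2000, burn_in=200,
                 data_weight=8.0, smooth_weight=1.0, seed=0):
    """Estimate P(label_i = 1) under p(labels) proportional to exp(-energy)."""
    rng = random.Random(seed)
    n = len(signal)
    labels = [0] * n
    current = energy(labels, signal, data_weight, smooth_weight)
    counts = [0] * n
    kept = 0
    for sweep in range(n_sweeps):
        for _ in range(n):
            i = rng.randrange(n)
            labels[i] ^= 1  # symmetric proposal: flip one label
            proposed = energy(labels, signal, data_weight, smooth_weight)
            # Accept with probability min(1, exp(current - proposed)).
            if proposed <= current or rng.random() < math.exp(current - proposed):
                current = proposed
            else:
                labels[i] ^= 1  # reject: undo the flip
        if sweep >= burn_in:
            kept += 1
            for j, z in enumerate(labels):
                counts[j] += z
    return [c / kept for c in counts]

# Usage: a dark region followed by a bright region.
marginals = mh_marginals([0.1, 0.0, 0.2, 0.9, 1.0, 0.8])
```

Each entry of `marginals` is the fraction of retained samples in which that position carries label 1. An energy minimizer returns a single labeling and cannot provide this kind of marginal event probability.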
|Value Independent Information Models|
Assessing the value of information for sensors in the context of distributed systems presents a challenging problem. Information driven approaches to active sensor management consider the expected information gain with regard to an explicit inference problem. However, for complex measurement models, simulation via Monte Carlo methods may be necessary to estimate the necessary quantities. This poses a computational bottleneck for large scale systems and planning over long time horizons.
In this project, we will examine the conditions under which, for distributions in the exponential family, the entropy depends only on the size of the data. For cases in which those conditions are not met, we will derive bounds on the entropy that are available before any actual measurements are acquired.
|Mixture Model MCMC Inference|
We develop parallelizable samplers for Dirichlet process mixture models that do not require approximating the infinite model. Two sub-clusters are fit to each regular cluster and are used to propose large split and merge moves. Inference is shown to be orders of magnitude faster than traditional Gibbs sampling while being more robust to different initializations and hyper-parameters.
|Modeling Smoothly Varying Texture|
Utilizing the steerable pyramid of Simoncelli and Freeman as a basis, we decompose textured regions of natural images into explicit local attributes of contrast, bias, scale, and orientation. Additionally, we impose smoothness on these attributes via Markov random fields. The combination allows for demonstrable improvements in common scene analysis applications including unsupervised segmentation, reflectance and shading estimation, and estimation of the radiometric response function from a single image.