Current Projects

Micro-Air Vehicle Navigation and Control

Our group has developed some of the first fully autonomous micro-aerial vehicle (MAV) systems capable of self-directed exploration in both GPS-denied and communications-denied environments. Without GPS and high-bandwidth communication links, the MAV is limited to the sensing and processing that can be carried onboard the vehicle.

One example of our past work (see here for video) is a fixed-wing vehicle capable of localizing itself in a known map and flying autonomously using only a 2D laser scanner and an inertial measurement unit (IMU), with all processing performed onboard the vehicle.

Our recent work (STAR, MLM) has focused on using small, inexpensive cameras to estimate the pose of the MAV and the geometric structure of the environment to enable high-speed autonomous flight.


  • K. Ok, W. N. Greene, and N. Roy. "Simultaneous Tracking and Rendering: Real-time Monocular Localization for MAVs" Proceedings of the International Conference on Robotics and Automation (ICRA), Stockholm, 2016.
    [PDF] [Video]
  • W. N. Greene, K. Ok, P. Lommel, and N. Roy. "Multi-Level Mapping: Real-time Dense Monocular SLAM." Proceedings of the International Conference on Robotics and Automation (ICRA), Stockholm, 2016.
    [PDF] [Video]
  • K. Ok, D. Gamage, T. Drummond, F. Dellaert, and N. Roy. "Monocular Image Space Tracking on a Computationally Limited MAV." Proceedings of the International Conference on Robotics and Automation (ICRA), Seattle, 2015.
  • A. Bry, C. Richter, A. Bachrach and N. Roy. "Aggressive Flight of Fixed-Wing and Quadrotor Aircraft in Dense Indoor Environments". International Journal of Robotics Research, 34(7):969-1002, June 2015.
    [PDF] [Bibtex Entry]
  • Charles Richter, Adam Bry, Nicholas Roy. (2013). "Polynomial Trajectory Planning for Aggressive Quadrotor Flight in Dense Indoor Environments." International Symposium of Robotics Research (ISRR), Singapore, 2013.
    [PDF] [BiBTeX Entry]
  • A. Bachrach, S. Prentice, R. He, P. Henry, A. S. Huang, M. Krainin, D. Maturana, D. Fox and N. Roy. "Estimation, Planning and Mapping for Autonomous Flight Using an RGB-D Camera in GPS-denied Environments". International Journal of Robotics Research, 31(11):1320-1343, September 2012.
  • Adam Bry, Abraham Bachrach and Nicholas Roy. "State Estimation for Aggressive Flight in GPS-Denied Environments Using Onboard Sensing". Proceedings of the IEEE International Conference on Robotics and Automation (ICRA). St Paul, MN 2012. (Nominated, best conference paper.)
    [PDF] [BiBTeX Entry]
  • A. Huang, A. Bachrach, P. Henry, M. Krainin, D. Maturana, D. Fox, N. Roy. "Visual Odometry and Mapping for Autonomous Flight Using an RGB-D Camera", Proceedings of the International Symposium of Robotics Research (ISRR), Flagstaff, AZ, 2011.
    [PDF] [Bibtex Entry]
  • A. Bachrach, S. Prentice, R. He, and N. Roy. "RANGE - Robust Autonomous Navigation in GPS-denied Environments". Journal of Field Robotics. 28(5):646-666, September 2011.
    [Compressed postscript] [PDF] [Bibtex Entry]
  • A. Bachrach, R. He, N. Roy. "Autonomous Flight in Unknown Indoor Environments". International Journal of Micro Air Vehicles, 1(4): 217-228, December 2009.
  • Ruijie He, Sam Prentice and Nicholas Roy. "Planning in Information Space for a Quadrotor Helicopter in a GPS-denied Environment". Proceedings of the IEEE International Conference on Robotics and Automation (ICRA 2008). Los Angeles, 2008.
  • Abraham Bachrach, Alborz Garamifard, Daniel Gurdan, Ruijie He, Sam Prentice, Jan Stumpf and Nicholas Roy. "Co-ordinated Tracking and Planning using Air and Ground Vehicles". In Proceedings of the International Symposium on Experimental Robotics (ISER), Athens, 2008.

Learning for Highly Dynamic Planning and Control

Navigation in unknown environments is difficult. As an autonomous robot explores an environment it has never seen, it must construct a map as it travels and replan its route as obstacles are discovered. Traditional planning algorithms require that the robot avoid situations that might cause it to crash, and therefore treat unobserved space as potentially containing unseen obstacles. As a result, navigation can be slow in cluttered spaces or even in hallways, where most robots must slow dramatically whenever they round a corner.

Some of our recent work uses machine learning to capture (at training time) the local environmental geometry so that we can predict (at run time) the probability that an action guiding the robot into unknown space will cause a collision. Although the robot is now allowed to violate strict safety constraints, we maintain 100% empirical safety while seeing substantial improvements in speed (see here for video). Relatedly, we have also developed an adaptation of this technique that re-introduces strict safety guarantees. In this variant, the learning algorithm instead predicts which actions will give the robot an information gain so that it may reach the goal faster (e.g. taking wider turns around corners so that it need not slow down as much).
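
As a rough illustration of the run-time decision described above, the sketch below picks the candidate action with the lowest expected cost, given a time-to-goal estimate and a predicted collision probability for each action. The function name, the action names, the numbers, and the fixed collision penalty are all illustrative assumptions; the probabilities are hard-coded here rather than produced by a learned model.

```python
from typing import Dict, Tuple


def choose_action(candidates: Dict[str, Tuple[float, float]],
                  collision_cost: float) -> str:
    """Return the name of the action with the lowest expected cost.

    candidates maps an action name to (time_to_goal, p_collision), where
    p_collision stands in for the output of a learned collision predictor.
    """
    def expected_cost(item: Tuple[str, Tuple[float, float]]) -> float:
        time_to_goal, p_collision = item[1]
        # Pay the travel time if no collision occurs, the penalty otherwise.
        return (1.0 - p_collision) * time_to_goal + p_collision * collision_cost

    return min(candidates.items(), key=expected_cost)[0]


# Hypothetical candidates at a corner: (seconds to goal, collision probability).
candidates = {
    "fast_tight_turn": (4.0, 0.20),
    "fast_wide_turn": (4.5, 0.01),
    "slow_tight_turn": (7.0, 0.001),
}
best = choose_action(candidates, collision_cost=100.0)
```

With these made-up numbers the fast, wide turn wins: it costs slightly more time than the tight turn but carries a much lower predicted risk, and it is far quicker than slowing down.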

Our ongoing research direction involves using learning to augment the performance of autonomous planning and control tasks, including intelligent navigation of more topologically complex environments and using monocular camera images to predict collision probability at high speeds.

  • Charlie Richter and Nicholas Roy. "Bayesian Learning for Safe High-Speed Navigation in Unknown Environments". In Proceedings of the International Symposium on Experimental Robotics (ISER), Tokyo, 2016. [PDF]
  • Charlie Richter, William Vega-Brown, and Nicholas Roy. "Bayesian Learning for Safe High-Speed Navigation in Unknown Environments". In Proceedings of the International Symposium on Robotics Research (ISRR), Sestri Levante, 2015. [PDF]

Natural Language Understanding for Human-Robot Interaction

Advances in robot autonomy have moved humans to a different level of interaction, where the ultimate success hinges on how effectively and intuitively humans and robots can work together to correctly accomplish a task. However, most service robots currently require fairly detailed low-level guidance from a trained operator, which often leads to constrained and non-intuitive interaction.

Natural language, by contrast, provides a rich, intuitive and flexible medium for humans and robots to communicate information. Our goal is to enable robots to understand natural language utterances in the context of their workspaces. We seek algorithmic models that bridge the semantic gap between high-level concepts (e.g. entities, events and routes) embedded in language utterances and the low-level metric representations (e.g. cost maps and point clouds) necessary for a robot to act in the world.

We have developed probabilistic models like Generalized Grounding Graphs and Distributed Correspondence Graphs to infer a grounding for language descriptions in the context of the agent’s perceived representation. In recent work, we introduced Adaptive Distributed Correspondence Graphs for efficient reasoning about abstract spatial concepts.

Our ongoing research focuses on acquiring semantic knowledge about the environment from observations or language descriptions. This allows the robot to ground commands that refer to past events or acquired factual knowledge. Another area of research addresses language understanding for specifying high-level tasks like search and rescue operations. Further, we are also investigating language understanding in partially known environments and exploration strategies for acquiring new and unknown concepts.

  • R. Paul, J. Arkin, N. Roy, and T. M. Howard. "Efficient Grounding of Abstract Spatial Concepts for Natural Language Interaction with Robot Manipulators". Robotics Science and Systems 2016. (Best conference paper)
  • Thomas Howard, Stefanie Tellex, Nicholas Roy. "A Natural Language Planner Interface for Mobile Manipulators." International Conference on Robotics and Automation (ICRA), Hong Kong, 2014.
  • Stefanie Tellex, Ross Knepper, Adrian Li, Daniela Rus and Nicholas Roy. "Asking for Help Using Inverse Semantics." Proceedings of Robotics Science and Systems (RSS), Berkeley, CA, 2014. (Best conference paper)
  • S. Tellex, T. Kollar, S. Dickerson, M. R. Walter, A. G. Banerjee, S. Teller, N. Roy. "Understanding Natural Language Commands for Robotic Navigation and Mobile Manipulation", Proceedings of the National Conference on Artificial Intelligence (AAAI), San Francisco, CA, 2011.

Enabling Semantic Understanding for Autonomous Marine Robots

The oceans cover over 70% of the Earth’s surface, yet less than five percent of this important biosphere has been explored to date. Much of the vast marine environment is dangerous or inaccessible to human divers, so the task of exploring Earth’s oceans will fall to marine robots. However, the development of exploratory marine robots has been stymied by the marine environment's unique challenges. The lack of radio communication forces all human control to pass through high-latency, low-bandwidth acoustic channels or hardwired tethers. These conditions necessitate the development of comprehensive and robust robot autonomy.

Our work in this area is split between two complementary thrusts: 1) learning an abstract representation of underwater image data that is conducive to semantic reasoning, and 2) using that abstract representation to build probabilistic models of the robot’s visual environment that allow more efficient exploratory path planning, anomaly detection, and mission data summarization.

We plan to address the problem of learning a meaningful feature representation of underwater images using deep learning. Even given the impressive performance of deep learning algorithms on computer vision problems, this is still challenging. Underwater images are very visually distinct from standard image datasets and there are no large corpora of labeled underwater image data available. Our current research direction involves using unsupervised convolutional autoencoders or minimally-supervised transfer learning frameworks to learn a latent feature representation of underwater image data.

Result of running HDP spatiotemporal topic model on image data from marine robot mission. Visually distinct terrains are clustered into different topics (colors).

Given this abstract feature representation, we are applying various probabilistic models to represent the robot’s knowledge about the observable world. Topic models provide a natural probabilistic framework for both anomaly detection and data summarization. Much of our previous work has focused on extending the Hierarchical Dirichlet Process (HDP), a Bayesian nonparametric topic model, to the real-time, spatiotemporal image data from a marine robot’s video stream. Our ongoing research direction involves building more sophisticated hierarchical topic models that allow a robot to understand the environment at multiple levels of abstraction.

Wind Field Estimation and Planning

A wind field estimate across the MIT campus.

With unmanned aerial vehicles (UAVs) becoming more prevalent and capable, and regulations evolving, their eventual operation in urban environments seems all but certain. As UAVs begin to fly in these environments, they will face a host of unique challenges, one of which is the complex wind fields generated by urban structures and terrain. Although much effort has been directed towards developing planning and estimation strategies for wind fields at high altitudes or in large open spaces, these approaches implicitly assume that the wind field evolves over relatively large temporal and spatial scales. Under this assumption, a history of local measurements can be used to estimate the global wind field with sufficient accuracy. Urban wind fields, however, vary strongly in both space and time, so this estimation method does not work well for them; they require an approach that models the complex interaction between the flow and the surrounding environment.

Our approach is to use prevailing wind estimates from local weather stations and a 3D model of the surrounding environment as inputs to a computational fluid dynamics solver to obtain both steady and unsteady wind field estimates. Unlike many approaches, these wind field estimates account for the strong coupling between the wind flow and nearby structures. Once obtained, these wind field estimates can be used to find minimum-energy trajectories between points of interest. In future work, we hope to leverage a library of precomputed wind fields to estimate the wind field covariance within a region. This uncertainty estimate could be used to infer a global wind field from local measurements, or to predict future wind conditions.
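
To make the idea of a minimum-energy trajectory through a known wind field concrete, here is a minimal sketch that runs Dijkstra's algorithm over a small grid. Everything specific in it is an assumption for illustration: the grid discretization, the 4-connected moves, the hand-written `toy_wind` field (a stand-in for a solver-generated estimate), and the simple energy model in which a tailwind makes a step cheaper and a headwind makes it more expensive. It is not the planner used in this project.

```python
import heapq
from typing import Callable, Dict, List, Tuple

Cell = Tuple[int, int]
Wind = Tuple[float, float]


def min_energy_path(width: int, height: int,
                    wind_at: Callable[[Cell], Wind],
                    start: Cell, goal: Cell) -> Tuple[List[Cell], float]:
    """Dijkstra search over a 4-connected grid with no obstacles.

    Assumed energy model: a step costs 1 minus the wind component along
    the direction of travel, floored at 0.1 so every cost stays positive.
    """
    moves = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    best: Dict[Cell, float] = {start: 0.0}
    parent: Dict[Cell, Cell] = {}
    frontier: List[Tuple[float, Cell]] = [(0.0, start)]
    while frontier:
        cost, cell = heapq.heappop(frontier)
        if cell == goal:
            break
        if cost > best[cell]:
            continue  # stale queue entry
        wx, wy = wind_at(cell)
        for dx, dy in moves:
            nxt = (cell[0] + dx, cell[1] + dy)
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
                continue
            tailwind = wx * dx + wy * dy
            new_cost = cost + max(0.1, 1.0 - tailwind)
            if new_cost < best.get(nxt, float("inf")):
                best[nxt] = new_cost
                parent[nxt] = cell
                heapq.heappush(frontier, (new_cost, nxt))
    # Walk parents back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(parent[path[-1]])
    path.reverse()
    return path, best[goal]


def toy_wind(cell: Cell) -> Wind:
    # Hypothetical field: headwind along the bottom row, tailwind above it.
    return (-0.5, 0.0) if cell[1] == 0 else (0.5, 0.0)


path, energy = min_energy_path(5, 3, toy_wind, start=(0, 0), goal=(4, 0))
```

In this toy field the cheapest route climbs one row to ride the tailwind rather than flying straight into the headwind along the bottom row.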

Past Projects

Understanding Natural Language Commands

Our system understands commands such as "Pick up the tire pallet off the truck and set it down."

Natural language is an intuitive and flexible modality for human-robot interaction. A robot designed to interact naturally with humans must be able to understand instructions without requiring the person to speak in any special way. We are building systems that robustly understand natural language commands produced by untrained users. We have applied our work to understanding spatial language commands for a robotic wheelchair, a robotic forklift, as well as a micro-air vehicle. More information is at

  • Thomas Kollar, Stefanie Tellex, Deb Roy and Nicholas Roy. "Grounding Verbs of Motion in Natural Language Commands to Robots", International Symposium on Experimental Robotics (ISER), New Delhi, India, Dec. 2010. [PDF]
  • Stefanie Tellex, Thomas Kollar, George Shaw, Nicholas Roy, and Deb Roy. "Grounding Spatial Language for Video Search," Proceedings of the Twelfth International Conference on Multimodal Interfaces (ICMI), 2010. (Winner, Best Student Paper award.) [PDF]
  • Thomas Kollar, Stefanie Tellex, Deb Roy and Nick Roy, "Toward understanding natural language directions," Human-Robot Interaction 2010. [PDF]

Human-Robot Interaction for Assistive Robots

We are developing planning and learning algorithms that can be used to optimize a wheelchair dialogue manager for a human-robot interaction system. The long-term goal of this research is to develop intelligent assistive technology, such as a robotic wheelchair, that can be used easily by an untrained population. We are working with the residents and staff of The Boston Home, a specialized care residence for adults with advanced multiple sclerosis and other progressive neurological diseases, to develop an intelligent interface to the residents' wheelchairs. An adaptive, intelligent dialogue manager will be essential for allowing a diverse population with a variety of physical and communication impairments to interact with the system.

  • F. Doshi and N. Roy. "Spoken Language Interaction with Model Uncertainty: An Adaptive Human-Robot Interaction System". Connection Science, To appear.
  • Finale Doshi and Nicholas Roy. "The Permutable POMDP: Fast Solutions to POMDPs for Preference Elicitation". Proceedings of the Seventh International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2008). Estoril, Portugal, 2008.

Planning Under Uncertainty

Continuous-state POMDPs provide a natural representation for a variety of tasks, including many in robotics. However, existing continuous-state POMDP approaches are limited by their reliance on a single linear model to represent the world dynamics. We have developed new switching-state (hybrid) dynamics models that can represent multi-modal state-dependent dynamics, and a new point-based POMDP planning algorithm for solving continuous-state POMDPs using this dynamics model. Additionally, POMDPs have succeeded in many planning domains because they can optimally trade off actions that increase an agent's knowledge against actions that increase an agent's reward. Unfortunately, most real-world POMDPs are defined with a large number of parameters, which are difficult to specify from domain knowledge alone.

We have shown that the POMDP model parameters can be incorporated as additional hidden states in a larger 'model-uncertainty' POMDP, and we have developed an approximate algorithm for planning in the induced 'model-uncertainty' POMDP. This approximation, coupled with model-directed queries, allows the planner to actively learn the true underlying POMDP and the accompanying policy.
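
The idea of treating model parameters as extra hidden state can be illustrated with a single Bayes update over a joint belief on (model, state) pairs. The sketch below is a deliberately reduced example and not the approximate planning algorithm mentioned above: it assumes a static hidden state, no actions or transitions, and just two hypothetical candidate observation models (`clear` and `noisy`) for a user who wants to go to one of two places.

```python
from typing import Dict, Tuple

JointBelief = Dict[Tuple[str, str], float]          # (model, state) -> probability
ObsModels = Dict[str, Dict[str, Dict[str, float]]]  # model -> state -> obs -> prob


def update_joint_belief(belief: JointBelief, obs_models: ObsModels,
                        observation: str) -> JointBelief:
    """One Bayes step: weight each (model, state) pair by the likelihood that
    model assigns to the observation in that state, then renormalize."""
    unnormalized = {
        (model, state): prob * obs_models[model][state][observation]
        for (model, state), prob in belief.items()
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under every hypothesis")
    return {key: value / total for key, value in unnormalized.items()}


def marginal_over_models(belief: JointBelief) -> Dict[str, float]:
    """Sum out the state to see how much weight each candidate model holds."""
    marginal: Dict[str, float] = {}
    for (model, _state), prob in belief.items():
        marginal[model] = marginal.get(model, 0.0) + prob
    return marginal


# Hypothetical observation models: how often the heard word matches the intent.
obs_models: ObsModels = {
    "clear": {"kitchen": {"hear_kitchen": 0.9, "hear_bedroom": 0.1},
              "bedroom": {"hear_kitchen": 0.1, "hear_bedroom": 0.9}},
    "noisy": {"kitchen": {"hear_kitchen": 0.6, "hear_bedroom": 0.4},
              "bedroom": {"hear_kitchen": 0.4, "hear_bedroom": 0.6}},
}
belief: JointBelief = {(m, s): 0.25 for m in obs_models for s in ("kitchen", "bedroom")}
for _ in range(2):
    belief = update_joint_belief(belief, obs_models, "hear_kitchen")
model_posterior = marginal_over_models(belief)
```

After two consistent observations, the joint belief shifts toward the kitchen and, at the same time, toward the `clear` model, because repeated agreement is more likely under a reliable observation model.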

  • E. Brunskill, L. Kaelbling, T. Lozano-Perez and Nicholas Roy. "Continuous-State POMDPs with Hybrid Dynamics". Proceedings of the Tenth International Symposium on Artificial Intelligence and Mathematics, Fort Lauderdale, FL, 2008.
  • F. Doshi, J. Pineau and N. Roy. "Bayes Risk for Active Learning in POMDPs". Proceedings of the International Conference on Machine Learning (ICML), Helsinki, Finland, 2008, pp. 256-263.


Mapping as a research problem has received considerable attention in robotics recently. Mature mapping techniques now allow practitioners to reliably and consistently generate 2-D and 3-D maps of objects, office buildings, city blocks and metropolitan areas with a comparatively small number of errors. Nevertheless, the ease of construction and quality of map are strongly dependent on the exploration strategy used to acquire sensor data. We have shown that reinforcement learning can be used to optimize the trajectory of a vehicle exploring an unknown environment. One of the primary technical challenges of exploration is being able to predict the value of different sensing strategies efficiently. We have shown that a robot can learn the effect of sensing strategies from past experience using kernel-based regression techniques. The local regression model can then be used inside a global planner to optimize a trajectory. We have demonstrated this technique both for a mobile robot building a map of an unknown environment, and an airborne mobile sensor collecting data for weather prediction.
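
As a small illustration of predicting the value of a sensing strategy from past experience, the sketch below uses a Nadaraya-Watson estimator with a Gaussian kernel, which is one common form of kernel-based regression. The choice of estimator, the feature vectors, the recorded values, and the bandwidth are all assumptions made for this example; the text above does not specify which kernel method or features were used.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], float]  # (strategy features, observed value)


def kernel_regress(query: Sequence[float], samples: List[Sample],
                   bandwidth: float) -> float:
    """Nadaraya-Watson estimate: a kernel-weighted average of past values.

    Samples close to the query (in feature space) get larger weights.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for features, value in samples:
        sq_dist = sum((q - f) ** 2 for q, f in zip(query, features))
        weight = math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        weighted_sum += weight * value
        weight_total += weight
    if weight_total == 0.0:
        raise ValueError("query is too far from every sample for this bandwidth")
    return weighted_sum / weight_total


# Hypothetical past experience: one feature per strategy, one observed value.
experience: List[Sample] = [((0.0,), 1.0), ((2.0,), 3.0)]
midpoint_value = kernel_regress((1.0,), experience, bandwidth=1.0)
```

A prediction like `midpoint_value` could then be queried by a global planner when it scores candidate trajectories, which is the role the local regression model plays in the description above.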

  • T. Kollar and N. Roy. "Trajectory Optimization using Reinforcement Learning for Map Exploration". International Journal of Robotics Research, 27(2): 175-197, 2008.
  • N. Roy, H. Choi, D. Gombos, J. Hansen, J. How and S. Park. "Adaptive Observation Strategies for Forecast Error Minimization". Proceedings of the International Conference on Computational Science, Beijing, 2007.

Mobile Manipulation

Robot manipulators largely rely on complete knowledge of object geometry in order to plan their motion and compute successful grasps. If an object is fully in view, the object geometry can be inferred from sensor data and a grasp computed directly. If the object is occluded by other entities in the environment, manipulation based on the visible part of the object may fail; therefore, to compensate, object recognition is often used to identify the location of the object and compute the grasp from a prior model. We are developing algorithms for geometric inference and manipulation planning that allow grasp plans to be computed with only partial information about the objects in the environment and their geometry. We are developing these ideas both for small-object manipulation in the home, and large-object supply-chain manipulation.

  • J. Glover, D. Rus and N. Roy. "Manipulation using Probabilistic Models of Object Geometry". Proceedings of Robotics: Science and Systems (RSS), 2008.