Robot Locomotion Group
The goal of our research is to build machines that exploit their natural dynamics to achieve extraordinary agility, efficiency, and robustness using rigorous tools from dynamical systems, control theory, and machine learning. Our current focus is on robotic manipulation, because the recent revolution in machine learning has opened a pathway in these applications to merging control theory and perception at a level that has never been considered before; ideas like "intuitive physics" and "common-sense reasoning" will meet with rigorous ideas like "model-order reduction" and "robust/adaptive control". It's going to be a great few years!
Our previous projects have included dynamics and control for humanoid robots, dynamic walking over rough terrain, flight control for aggressive maneuvers in unmanned aerial vehicles, feedback control for fluid dynamics and soft robotics, and connections between perception and control.
Offline optimization paradigms such as offline Reinforcement Learning (RL) or Imitation Learning (IL) allow policy search algorithms to make use of offline data, but require careful incorporation of uncertainty in order to circumvent the challenges of distribution shift. Gradient-based policy search methods are a promising direction due to their effectiveness in high dimensions; however, we require a more careful consideration of how these methods interplay with uncertainty estimation. We claim that in order for an uncertainty metric to be amenable to gradient-based optimization, it must be (i) stably convergent to data when uncertainty is minimized with gradients, and (ii) not prone to underestimation of true uncertainty. We investigate smoothed distance to data as a metric, and show that it not only stably converges to data, but also allows us to analyze model bias with Lipschitz constants. Moreover, we establish an equivalence between smoothed distance to data and data likelihood, which allows us to use score-matching techniques to learn gradients of distance to data. Importantly, we show that offline model-based policy search problems that maximize data likelihood do not require values of the likelihood, but only the gradient of the log likelihood (the score function). Using this insight, we propose Score-Guided Planning (SGP), a planning algorithm for offline RL that utilizes score-matching to enable first-order planning in high-dimensional problems, where zeroth-order methods were unable to scale, and ensembles were unable to overcome local minima.
Supplemental materials: https://sites.google.com/view/score-guided-planning/home
Under review. Comments welcome.
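The link between smoothed distance to data and the score function in the abstract above can be illustrated on a toy dataset. The sketch below is not the SGP algorithm and does not learn anything with score matching: it assumes the data density is the empirical distribution perturbed by Gaussian noise of scale sigma, in which case the negative log density is a softmin of squared distances to the data and its gradient has a closed form. The function names, the dataset, and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np

def log_density(x, data, sigma):
    """Unnormalized log density of the Gaussian-perturbed empirical distribution.

    Up to a constant, its negative is a softmin of squared distances to the data.
    """
    sq_dists = np.sum((data - x) ** 2, axis=1)
    logits = -sq_dists / (2.0 * sigma ** 2)
    shift = logits.max()  # stabilize the log-sum-exp
    return shift + np.log(np.sum(np.exp(logits - shift)))

def smoothed_score(x, data, sigma):
    """Closed-form gradient of log_density with respect to x."""
    diffs = data - x  # vectors pointing from x to each data point
    logits = -np.sum(diffs ** 2, axis=1) / (2.0 * sigma ** 2)
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()  # softmax weights favor nearby data
    return (weights[:, None] * diffs).sum(axis=0) / sigma ** 2

def ascend(x0, data, sigma, step, iters):
    """First-order ascent that only ever queries the score, never the density."""
    x = np.array(x0, dtype=float)
    for _ in range(iters):
        x = x + step * smoothed_score(x, data, sigma)
    return x

data = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
x_final = ascend([3.0, 2.0], data, sigma=0.3, step=0.05, iters=200)
print("final point:", x_final)
print("distance to nearest data point:",
      np.min(np.linalg.norm(data - x_final, axis=1)))
```

The ascent loop never evaluates the density itself, which mirrors the observation that only the score is needed; in this toy setting the iterate is pulled from a point far from the data back toward it.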
Computing optimal, collision-free trajectories for high-dimensional systems is a challenging problem. Sampling-based planners struggle with the dimensionality, whereas trajectory optimizers may get stuck in local minima due to inherent nonconvexities in the optimization landscape. The use of mixed-integer programming to encapsulate these nonconvexities and find globally optimal trajectories has recently shown great promise, thanks in part to tight convex relaxations and efficient approximation strategies that greatly reduce runtimes. These approaches were previously limited to Euclidean configuration spaces, precluding their use with mobile bases or continuous revolute joints. In this paper, we handle such scenarios by modeling configuration spaces as Riemannian manifolds, and we describe a reduction procedure for the zero-curvature case to a mixed-integer convex optimization problem. We demonstrate our results on various robot platforms, including producing efficient collision-free trajectories for a PR2 bimanual mobile manipulator.
To appear at RSS 2023.
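To see why continuous revolute joints break the Euclidean assumption mentioned in the abstract above, consider a configuration made only of unbounded joint angles, which lives on a flat torus. The sketch below is a minimal illustration under that assumption, not the paper's reduction procedure; the helper names are hypothetical. It computes the shortest displacement between two configurations by wrapping each angle difference into [-pi, pi).

```python
import math

def wrap_to_pi(angle):
    """Map an angle in radians to the equivalent value in [-pi, pi)."""
    return (angle + math.pi) % (2.0 * math.pi) - math.pi

def torus_displacement(q_start, q_goal):
    """Shortest per-joint displacement between two joint-angle configurations."""
    return [wrap_to_pi(goal - start) for start, goal in zip(q_start, q_goal)]

def torus_distance(q_start, q_goal):
    """Length of the shortest path on the flat torus."""
    return math.sqrt(sum(d * d for d in torus_displacement(q_start, q_goal)))

q_start = [math.radians(170.0), 0.0]
q_goal = [math.radians(-170.0), 0.5]
print("naive Euclidean distance:", math.dist(q_start, q_goal))
print("torus distance:", torus_distance(q_start, q_goal))
```

A straight line in joint coordinates would rotate the first joint by 340 degrees, while the wrapped displacement crosses the seam and rotates it by only 20 degrees.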
Decentralized learning has been advocated and widely deployed to make efficient use of distributed datasets, with an extensive focus on supervised learning (SL) problems. Unfortunately, the majority of real-world data are unlabeled and can be highly heterogeneous across sources. In this work, we carefully study decentralized learning with unlabeled data through the lens of self-supervised learning (SSL), specifically contrastive visual representation learning. We study the effectiveness of a range of contrastive learning algorithms under decentralized learning settings, on relatively large-scale datasets including ImageNet-100, MS-COCO, and a new real-world robotic warehouse dataset. Our experiments show that the decentralized SSL (Dec-SSL) approach is robust to the heterogeneity of decentralized datasets, and learns useful representations for object classification, detection, and segmentation tasks. This robustness makes it possible to significantly reduce communication and reduce the participation ratio of data sources with only minimal drops in performance. Interestingly, using the same amount of data, the representation learned by Dec-SSL can not only perform on par with that learned by centralized SSL, which requires communication and excessive data storage costs, but also sometimes outperform representations extracted from decentralized SL, which requires extra knowledge about the data labels. Finally, we provide theoretical insights into understanding why data heterogeneity is less of a concern for Dec-SSL objectives, and introduce feature alignment and clustering techniques to develop a new Dec-SSL algorithm that further improves performance in the face of highly non-IID data. Our study presents positive evidence to embrace unlabeled data in decentralized learning, and we hope to provide new insights into whether and why decentralized SSL is effective.
Preprint. Comments welcome.
The empirical success of Reinforcement Learning (RL) in contact-rich manipulation leaves much to be understood from a model-based perspective, where the key difficulties are often attributed to (i) the explosion of contact modes, (ii) stiff, non-smooth contact dynamics and the resulting exploding / discontinuous gradients, and (iii) the non-convexity of the planning problem. The stochastic nature of RL addresses (i) and (ii) by effectively sampling and averaging the contact modes. On the other hand, model-based methods have tackled the same challenges by smoothing contact dynamics analytically. Our first contribution is to establish the theoretical equivalence of the two smoothing schemes for simple systems, and provide qualitative and empirical equivalence on several complex examples. In order to further alleviate (ii), our second contribution is a convex, differentiable and quasi-dynamic formulation of contact dynamics, which is amenable to both smoothing schemes, and has proven through experiments to be highly effective for contact-rich planning. Our final contribution resolves (iii), where we show that classical sampling-based motion planning algorithms can be effective in global planning when contact modes are abstracted via smoothing. Applying our method on a collection of challenging contact-rich manipulation tasks, we demonstrate that efficient model-based motion planning can achieve results comparable to RL with dramatically less computation.
Supplemental materials: https://sites.google.com/view/global-planning-contact/home
Preprint. Comments welcome.
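The idea that sampling-based and analytic smoothing of contact can coincide, as described in the abstract above, can be checked numerically on a one-dimensional toy. The sketch below is an illustrative assumption rather than the paper's formulation: it models a contact force as max(0, x), smooths it by averaging over Gaussian perturbations of x, and compares the Monte Carlo average with the known closed-form mean of a rectified Gaussian. The function names, noise scale, and sample count are hypothetical.

```python
import math
import numpy as np

def contact_force(x):
    """Toy non-smooth contact model: zero force until contact, then linear."""
    return np.maximum(0.0, x)

def randomized_smoothing(x, sigma, num_samples, rng):
    """Monte Carlo estimate of E[contact_force(x + w)] with w ~ N(0, sigma^2)."""
    noise = rng.normal(0.0, sigma, size=num_samples)
    return float(np.mean(contact_force(x + noise)))

def analytic_smoothing(x, sigma):
    """Closed form of the same expectation: x * Phi(x/sigma) + sigma * phi(x/sigma)."""
    z = x / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return x * cdf + sigma * pdf

rng = np.random.default_rng(0)
for x in (-0.2, 0.0, 0.2):
    sampled = randomized_smoothing(x, 0.1, 200_000, rng)
    exact = analytic_smoothing(x, 0.1)
    print(f"x={x:+.1f}  sampled={sampled:.4f}  analytic={exact:.4f}")
```

Both versions replace the kink at x = 0 with a smooth curve that produces a small force slightly before contact, which is what gives gradient-based planners a useful signal across the contact boundary.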
Trajectory optimization offers mature tools for motion planning in high-dimensional spaces under dynamic constraints. However, when facing complex configuration spaces, cluttered with obstacles, roboticists typically fall back to sampling-based planners that struggle in very high dimensions and with continuous differential constraints. Indeed, obstacles are the source of many textbook examples of problematic nonconvexities in the trajectory-optimization problem. Here we show that convex optimization can, in fact, be used to reliably plan trajectories around obstacles. Specifically, we consider planning problems with collision-avoidance constraints, as well as cost penalties and hard constraints on the shape, the duration, and the velocity of the trajectory. Combining the properties of Bézier curves with a recently proposed framework for finding shortest paths in Graphs of Convex Sets (GCS), we formulate the planning problem as a compact mixed-integer optimization. In stark contrast with existing mixed-integer planners, the convex relaxation of our programs is very tight, and a cheap rounding of its solution is typically sufficient to design globally optimal trajectories. This reduces the mixed-integer program back to a simple convex optimization, and automatically provides optimality bounds for the planned trajectories. We name the proposed planner GCS, after its underlying optimization framework. We demonstrate GCS in simulation on a variety of robotic platforms, including a quadrotor flying through buildings and a dual-arm manipulator (with fourteen degrees of freedom) moving in a confined space. Using numerical experiments on a seven-degree-of-freedom manipulator, we show that GCS can outperform widely used sampling-based planners by finding higher-quality trajectories in less time.
Preprint. Comments welcome.
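One property of Bézier curves that makes them convenient for planning inside convex regions, as in the abstract above, is that the curve always lies in the convex hull of its control points. The sketch below is an illustration only, not the GCS planner: it evaluates a curve with de Casteljau's algorithm and checks that, when all control points sit inside a box, every sampled point of the curve does too. The control points, the box, and the function names are hypothetical.

```python
import numpy as np

def bezier_point(control_points, t):
    """Evaluate a Bézier curve at t in [0, 1] with de Casteljau's algorithm."""
    points = np.asarray(control_points, dtype=float)
    while len(points) > 1:
        # Repeated linear interpolation between neighboring points.
        points = (1.0 - t) * points[:-1] + t * points[1:]
    return points[0]

def curve_stays_in_box(control_points, lower, upper, num_samples=101):
    """Check sampled curve points against an axis-aligned box (a convex set)."""
    samples = np.array([bezier_point(control_points, t)
                        for t in np.linspace(0.0, 1.0, num_samples)])
    tol = 1e-12  # allow for floating-point round-off
    return bool(np.all(samples >= lower - tol) and np.all(samples <= upper + tol))

control_points = np.array([[0.0, 0.0], [0.2, 1.0], [0.8, 1.0], [1.0, 0.0]])
lower, upper = np.array([0.0, 0.0]), np.array([1.0, 1.0])
print("curve inside box:", curve_stays_in_box(control_points, lower, upper))
```

Because each de Casteljau step is a convex combination, constraining a handful of control points to a convex set is enough to keep the entire continuous curve inside it, with no need to check the curve pointwise.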
January 20, 2023. PhD Defense. Congratulations to Tao Pang for successfully defending his PhD thesis!
July 15, 2022. Award. Congratulations to Terry Suh, Max Simchowitz, and Kaiqing Zhang, whose paper titled "Do differentiable simulators give better policy gradients?" was recognized with the "Outstanding Paper Award" at ICML 2022.
January 13, 2022. Award. Congratulations to Alexandre Amice, Hongkai Dai, Pete Werner, and Annan Zhang, whose paper titled "Finding and Optimizing Certified, Collision-Free Regions in Configuration Space for Robot Manipulators" was recognized with the "Outstanding Paper Award" at WAFR 2022.
June 17, 2022. PhD Defense. Congratulations to Yunzhu Li for successfully defending his PhD thesis!
June 17, 2022. PhD Defense. Congratulations to Greg Izatt for successfully defending his PhD thesis!
August 15, 2020. Talks on Zoom. For better or worse, most research talks are now online. I've posted a handful of links to new talks, including Russ on Lex Fridman's AI Podcast and at the IFRR Colloquium on the Roles of Physics-Based Models and Data-Driven Learning in Robotics.
July 20, 2020. PhD Defense. Congratulations to Lucas Manuelli for successfully defending his PhD thesis!
May 29, 2020. PhD Defense. Congratulations to Shen Shen for successfully defending her thesis!
September 18, 2019. PhD Defense. Congratulations to Twan Koolen for successfully defending his thesis!
August 19, 2019. PhD Defense. Congratulations to Pete Florence for successfully defending his thesis!
October 15, 2018. PhD Defense. Congratulations to Robin Deits for successfully defending his thesis!
October 3, 2018. Award. Congratulations to Pete Florence and Lucas Manuelli, whose paper "Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation" won the Conference Best Paper Award at CoRL 2018!
September 19, 2018. Award. Congratulations to Pete Florence and Lucas Manuelli, whose paper "Dense Object Nets: Learning Dense Visual Object Descriptors By and For Robotic Manipulation" won the first-ever Amazon Robotics Best Technical Paper Award (2018).
June 18, 2018. Award. Congratulations to Ani Majumdar, whose paper "Funnel libraries for real-time robust feedback motion planning" won the first-ever International Journal of Robotics Research Paper of the Year (2017).
April 26, 2018. Award. Congratulations to Katy Muhlrad for winning the "Audience Choice Award" at the SuperUROP Showcase for her work on "Using GelSight to Identify Objects by Touch".
July 26, 2017. PhD Defense. Frank Permenter successfully defended his thesis, titled "Reduction methods in semidefinite and conic optimization". Congratulations, Frank!
May 19, 2017. Award. Pete Florence was awarded the EECS Masterworks award. Congratulations, Pete!
May 19, 2017. Award. Sarah Hensley was awarded the 2017 Best SuperUROP Presentation award. Congratulations, Sarah!
May 16, 2017. PhD Defense. Michael Posa successfully defended his thesis, titled "Optimization for Control and Planning of Multi-Contact Dynamic Motion". Congratulations, Michael!