Approximate Nearest Neighbors Methods for Learning and Vision

Trevor Darrell, Piotr Indyk, Greg Shakhnarovich
(MIT Computer Science and Artificial Intelligence Lab)
Paul Viola (Microsoft Research)
Dec 13


Presentation Abstracts

Organizers &


Example-based methods have been employed in machine learning for decades, and shown good performance. Their main drawback is the computational intensity of the search for nearest neighbors as the number of examples and especially the dimension of the example space become large. This has significantly diminished the appeal of these methods for modern problems, which deal with very large amounts of high-dimensional data. Computer vision is an area in which this is particularly true.

Recent research efforts in the theoretical and algorithmic community have produced algorithms for very fast approximate similarity search, which have the potential of making example-based learning feasible again. There has been growing interest in the learning and vision community in this topic, and in the last two years some published results have started to appear.


The main goal of this workshop is to try to formulate and analyze research agenda in this new area, and to identify important directions in which related research will move in the following years. By bringing together researchers from the communities working on machine learning, vision and theory we aim at developing common understanding of the potential and limitations of the newly available tools, what problems can benefit from these tools, and share the experience and insights that have been accumulating in the past few years.


Below is a non-exhaustive list of issues we hope to address at the workshop. Some of the topics have seen considerable attention, some are largely open.
  • The effect of approximation on the traditional estimation and classification methods. I.e., how does using approximate NN effect the known bounds on the risk of NN rule?
  • Similarity metrics for specific tasks. Should they be learned from data? How?
  • Advantages and disadvantages of example-based learning, especially in the presence of approximation.
  • Sampling complexity. How can one determine the appropriate amount of data required for useful approximate example-based learning?
  • Problem-specific approximate similarity search methods (in vision, text, speech etc.)
  • Different problems in similarity search (nearest neighbor versus point-in-ball location), and their importance to learning algorithms

Format and schedule

The schedule of the workshop is posted here. We plan to divide the time for this workshop roughly evenly between theory, learning, and perception. The time will be divided between presentations, moderated discussion and possibly a poster viewing session. Please contact the organizers if you would like to present a poster in such session.

gregory @