Visual Recognition and Tracking for Perceptive Interfaces
Devices should be perceptive and respond directly to their human user and/or environment. In this talk I’ll present new computer vision algorithms for fast recognition, indexing, and tracking that make this possible, enabling multimodal interfaces that respond to users’ conversational gesture and body language, robots that recognize common object categories, and mobile devices that can search using visual cues of specific objects of interest. As time permits, I’ll describe recent advances in real-time human pose tracking for multimodal interfaces, including new methods that exploit fast computation of approximate likelihood with a pose-sensitive image embedding. I’ll also present our linear-time approximate correspondence kernel, the Pyramid Match, and its use for image indexing, object recognition, and discovery of object categories. Throughout the talk, I’ll show interface examples based on these techniques, including grounded multimodal conversation and mobile image-based information retrieval applications.
Speaker Bios
Trevor Darrell is an Associate Professor of Electrical Engineering and Computer Science at MIT. He leads the Vision Interface Group at the Computer Science and Artificial Intelligence Laboratory. His interests include computer vision, interactive graphics, and machine learning. Prior to joining the faculty of MIT, he worked as a Member of the Research Staff at Interval Research in Palo Alto, CA, researching vision-based interface algorithms for consumer applications. He received his PhD and SM from MIT in 1996 and 1991, respectively, while working at the Media Laboratory, and his BSE from the University of Pennsylvania in 1988, where he worked in the GRASP Robotics Laboratory.
- Speakers:
- Trevor Darrell
- Affiliation:
- MIT CSAIL
Jeff Running