My research interests lie at the intersection of AI and robotics. I am interested in technologies for creating robots with cognitive abilities, so that they can coexist with humans in shared environments and improve themselves over time through continuous online and offline training, exploration, and interaction with humans and their environments. Toward this general goal, my current research focuses on establishing the notion of “semantic understanding”: the ability to perceive the semantic meaning of visual and language inputs and to infer further information from them in order to devise plans for the future.
The growth of social media and crowdsourcing platforms has opened access to compound descriptions of images beyond simple labels; for example, the textual description accompanying an image posted on social media may contain contextual information beyond the labels of image regions. The ability to understand this type of context-rich information in a perception system can be extremely useful in problem domains such as disaster response, where humanitarian volunteers assess damage by looking through a plethora of images of an affected area and textual descriptions from social media. In addition, robots interacting with humans will need to understand natural language in order to integrate what they have seen with what they have been told, and to express their understanding in natural language to communicate back to humans. This level of intelligence would enable human teammates to interact easily with complex robots without requiring customized interfaces or special training.
In this context, I view semantic understanding as a problem of information translation among different modalities. My current direction focuses specifically on the translation of information among vision, language, and planning so that robots can 1) perceive richer meanings of a scene by augmenting vision with verbal information, 2) describe in natural language what they have observed, 3) plan navigation that follows complex directions, and 4) explain their past, current, and future plans. By necessity, my research is interdisciplinary, spanning various fields of AI and robotics including natural language understanding, computer vision, machine learning, and planning.
This work is conducted in part through collaborative participation in the Robotics Consortium sponsored by the U.S. Army Research Laboratory under the Collaborative Technology Alliance Program, Cooperative Agreement W911NF-10-2-0016. The RCTA consortium is composed of CMU, the University of Pennsylvania, the University of Central Florida, Florida State University, the Massachusetts Institute of Technology, the Jet Propulsion Laboratory, General Dynamics Robotic Systems, and Robotic Research, LLC.
Collaborators:
Christian Lebiere (Psychology) - cognitive architecture
Katharina Muelling (RI) - manipulation, learning
Junjie Hu (LTI) - natural language understanding
Sz-Rung Shiang (LTI) - language-vision fusion, multimodal information fusion
Anirudh Vemula (RI) - social navigation, path planning in dynamic environments
Anatole Gershman, Jaime Carbonell (LTI) - natural language processing
Adi Nemlekar (NREC) - plan verification
Sagar Chaki, David Kyle, Scott Hissam (SEI) - RCTA-SEI collaboration on statistical model checking for robot plan verification
Oscar Romero (MLD, CMU), Jerry Vinokrov, Alessandro Oltramari (Bosch), Unmesh Kurup (Bosch), Felix Duvallet (EPFL), Paul Vernaza (NEC), Abdeslam Boularias (Rutgers), Jackie Libby, Tony Stentz (Uber ATC), Stephanie Rosenthal (SEI)
Past students and interns:
Michael Jason Gnanasekar, Summer 2016 (Facebook)
Kevin Zhang, Summer 2016 (ECE, CMU)
Sonia Appasamy, Summer 2015 (CS, Cornell)
Bikramjot Hanzra, Shiyu Dong, Tae-Hyung Kim, Tushar Chugh, William Seto, RI 16-662: Robot Autonomy class project, Spring 2016 (M.S. in Robotic Systems Development, CMU)