Multimodal spatial language maps for robot navigation and manipulation

Chenguang Huang, Oier Mees, Andy Zeng, Wolfram Burgard, University of Technology Nuremberg, UC Berkeley, Google Research

The International Journal of Robotics Research, Ahead of Print

Published online on July 28, 2025

Abstract

Grounding language to a navigating agent’s observations can leverage pretrained multimodal foundation models to match perceptions to object or event descriptions. However, previous approaches remain disconnected from environment mapping, lack the spatial ...