FMimic: Foundation models are fine-grained action learners from human videos
The International Journal of Robotics Research
Published online on October 17, 2025
Abstract
The International Journal of Robotics Research, Ahead of Print.
Visual imitation learning (VIL) provides an efficient and intuitive strategy for robotic systems to acquire novel skills. Recent advancements in foundation models, particularly vision language models (VLMs), have demonstrated remarkable capabilities in ...