Active reward learning and iterative trajectory improvement from comparative language feedback

The International Journal of Robotics Research

Published online on

Abstract

The International Journal of Robotics Research, Ahead of Print.
Human-in-the-loop learning has gained traction in fields like robotics and natural language processing in recent years. While prior work mostly relies on human feedback in the form of preference comparisons, this feedback type has multiple limitations. It ...