Abstract:
Robot learning is fundamentally data-constrained. Internet-scale human data is a promising source of additional data about human environments, tasks, and common skills. This data comes in diverse representations, such as human videos and Large Language Models, and can provide supervision and priors at all levels of robot reasoning and control. However, leveraging this data for robot learning is not trivial. Regardless of the representation, human data often lacks physical details and contains drastically different embodiments and environments from our target robot deployments. In this thesis, I argue that one of the keys to effectively leveraging human data for robot learning lies in identifying appropriate features and modalities in this human data for each level of robot reasoning. My work on robot learning from human data follows a three-step process: analyze deployed robot learning systems augmented with human data on common tasks, identify appropriate features and failure modes, and integrate models trained on these appropriate features into existing robot learning and reasoning paradigms. This thesis demonstrates these strategies across both task planning and skill learning domains. In task planning, we identify high-level language representations of visual details as good features and design an approach that leverages Bayesian reasoning about information gain to better ground LLM-based planners to their environment. In skill learning, we identify visual motion representations as good features and present an approach that uses dense reward signals learned from human video to rapidly improve robot performance in real-world experiments. In addition to the specific methods and approaches presented here, the work in this thesis offers a general strategy for robot learning from diverse human data.
@phdthesis{Verghese-2026-88273,
author = {Mrinal Verghese},
title = {Strategies for Robot Learning from Human Data},
year = {2026},
month = {April},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-26-34},
keywords = {Robot Learning, Task Planning, Large Language Models, Learning from Demonstration, Reinforcement Learning},
}