May 2026
Learning Generalizable Robot Skills from Diverse Data Sources and Modalities
Authors: Jason Jingzhou Liu
Abstract:
Robust robot behavior in real-world environments requires generalization across diverse objects, scenes, and embodiments despite limited training data. This thesis studies how different sources and modalities of data can improve different forms of robot generalization. It explores three complementary directions: force information for object-level generalization in contact-rich manipulation, human demonstration data for environment- and embodiment-level generalization, and large-scale simulation for environment-level generalization in reactive motion generation.
First, this thesis presents FACTR, a force-aware imitation learning framework that combines a low-cost bilateral teleoperation system with a force-attending curriculum, improving generalization to unseen objects in contact-rich tasks. Second, it presents DexWild, a scalable framework that uses co-training on human and robot demonstrations to improve generalization to unseen environments while reducing robot-specific data requirements and supporting transfer across embodiments. Third, it presents Deep Reactive Policy (DRP), a simulation-trained framework for reactive motion generation that transfers zero-shot to the real world and achieves robust performance in complex dynamic environments. Together, these results show that each data source and modality supports a different form of generalization: force information at the object level, human demonstrations at the environment and embodiment levels, and large-scale simulation at the environment level.
Notes:
@mastersthesis{Liu-2026-88299,
author = {Jason Jingzhou Liu},
title = {Learning Generalizable Robot Skills from Diverse Data Sources and Modalities},
year = {2026},
month = {May},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-26-24},
keywords = {Robot Learning, Imitation Learning, Generalization},
}