Efficient Interactive Learning: Learning More From Less

April 2026

Authors:

Gokul Swamy

Abstract:

Even as we near the limits of what human-generated data we can scrape from the Internet, today’s decision-making agents — from robots to large language models (LLMs) — are still far from perfect. Thus, as we start to move past the era of simply scaling up training datasets, an increasingly urgent question that cuts across much of decision-making is how we can learn more from less data.

In theory, agents that collect their own data and learn from this experience might allow us to transcend the limits of static datasets. However, two core challenges have made it difficult to deliver on this promise of reinforcement learning (RL) in practice. The first is exploration: experiencing the right outcomes. The second is specification: knowing if what you experienced was good or bad. In short, this thesis focuses on algorithms that address both of these challenges in tandem: on making good decisions efficiently, even when “good” is hard to specify.

In greater detail, we begin by formalizing when interaction is necessary and sufficient for learning performant decision-making policies (Part I). We then derive imitation learning algorithms that learn recovery behavior without needing to solve global exploration problems (Part II). Next, we derive preference fine-tuning algorithms that robustly handle inconsistent feedback that complicates specification (Part III). Lastly, we present both theoretical and practical evidence that learning reward models and performing RL, rather than directly learning policies, allows us to squeeze more out of the same amount of data due to generation-verification gaps (Part IV). Put together, this thesis argues that interactive learning presents a particularly promising path towards more capable decision-making agents and makes progress on the two key bottlenecks required to deliver on this promise.

Notes:

@phdthesis{Swamy-2026-88269,
author = {Gokul Swamy},
title = {Efficient Interactive Learning: Learning More From Less},
year = {2026},
month = {April},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-26-35},
}