Structured Policies for Efficient Knowledge-Guided Learning from Humans

May 2026

Authors:

Feiyu Zhu

Abstract:

Imitation learning has achieved strong performance in sequential decision-making tasks, but it typically requires large numbers of expert demonstrations, has limited generalization to unseen scenarios, and is difficult for laypeople without technical backgrounds to use. This thesis introduces structured policies, a framework that integrates human domain knowledge into imitation learning by using large language models (LLMs) to generate semantically meaningful policy structures from natural language instructions while learning continuous parameters from demonstrations. By explicitly encoding task-relevant latent variables and their dependencies, structured policies focus on the essential causal structure of the expert policy, improving sample efficiency, robustness, and interpretability. We first present Knowledge Informed Models (KIM), which integrate expert domain knowledge and demonstrations in a straightforward way, and demonstrate their sample efficiency and robustness in continuous control domains such as Lunar Lander and Car Racing. We then present Interactive Policy Restructuring and Training (\interp{}), an interactive learning paradigm that allows end users to iteratively provide instructions and demonstrations to refine the policy, and we show through a user study that it can learn dependable policies from laypeople. Together, the two projects show that structured policies are a promising way to integrate symbolic knowledge and continuous demonstrations for learning from human teachers.
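
To make the split between a fixed policy structure and learned continuous parameters concrete, the following is a minimal sketch on a toy one-dimensional landing task. It is not the KIM implementation from the thesis. The structure (a latent `target_velocity` that depends on height, and a thrust that depends on the latent and the velocity), the toy `expert`, and the parameter names `k_h` and `k_v` are all assumptions made for illustration. The structure is hand-written here, standing in for what an LLM might produce from an instruction such as "descend more slowly as you get closer to the ground".

```python
import random

# Hypothetical structure, hand-written here in place of LLM output:
#   target_velocity = -k_h * height                     (latent variable)
#   thrust          =  k_v * (target_velocity - velocity)
# Expanding gives thrust = -(k_v * k_h) * height - k_v * velocity,
# which is linear in the features (height, velocity).

def structured_policy(height, velocity, k_h, k_v):
    """Evaluate the structured policy for given continuous parameters."""
    target_velocity = -k_h * height
    return k_v * (target_velocity - velocity)

def expert(height, velocity):
    """Toy expert used only to generate demonstrations (an assumption)."""
    return structured_policy(height, velocity, k_h=0.5, k_v=2.0)

def fit_structured_policy(demos):
    """Least-squares fit of thrust = a*height + b*velocity via the
    2x2 normal equations, then recover k_h and k_v from a and b."""
    shh = sum(h * h for h, v, u in demos)
    shv = sum(h * v for h, v, u in demos)
    svv = sum(v * v for h, v, u in demos)
    shu = sum(h * u for h, v, u in demos)
    svu = sum(v * u for h, v, u in demos)
    det = shh * svv - shv * shv
    a = (shu * svv - svu * shv) / det
    b = (svu * shh - shu * shv) / det
    k_v = -b          # b = -k_v
    k_h = a / b       # a = -k_v * k_h, so a / b = k_h
    return k_h, k_v

# Five noise-free demonstration samples: (height, velocity, expert thrust).
random.seed(0)
demos = []
for _ in range(5):
    h = random.uniform(0.0, 10.0)
    v = random.uniform(-3.0, 3.0)
    demos.append((h, v, expert(h, v)))

k_h, k_v = fit_structured_policy(demos)
print("recovered k_h, k_v:", k_h, k_v)
```

Because the structure leaves only two parameters to estimate, a handful of samples is enough to recover them in this toy setting, which is the intuition behind the sample-efficiency claim. Real tasks such as Lunar Lander would need richer structures and a gradient-based fit.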

Notes:

@mastersthesis{Zhu-2026-88292,
author = {Feiyu Zhu},
title = {Structured Policies for Efficient Knowledge-Guided Learning from Humans},
year = {2026},
month = {May},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-26-06},
keywords = {Neuro-Symbolic AI, Learning from Demonstrations},
}