Off-road Autonomous Driving via Guided Reinforcement Learning

July 2025

Authors:

Vedant Mundheda

Abstract:

Off-road autonomous driving presents a complex set of challenges, including navigation through unmapped environments, variable terrain geometries, and uncertain, non-stationary dynamics. These conditions demand planning and control strategies that are both long-horizon and adaptable. Traditional Model Predictive Control (MPC) methods rely on dense sampling and precise dynamics modeling, which limits their feasibility for real-time planning in unstructured terrains. In contrast, Reinforcement Learning (RL) approaches offer fast execution but suffer from poor exploration efficiency, particularly in obstacle-dense and dynamically diverse settings.

This thesis proposes a hierarchical autonomy framework that integrates a low-frequency, long-horizon planner with a high-frequency, reactive RL-based controller. To overcome the exploration limitations of RL, the thesis introduces a novel teacher-student training paradigm. A teacher policy, trained off-policy using expert trajectories or heuristics, guides the learning process of a student policy trained on-policy. The thesis further extends the Proximal Policy Optimization (PPO) algorithm with a new hybrid policy gradient formulation that effectively leverages off-policy guidance alongside stable on-policy updates.
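The two-rate structure described above can be sketched as a single control loop in which the planner is called only every few controller ticks. This is a minimal illustration, not the thesis's implementation: the class `HierarchicalDriver`, the `replan_every` ratio, and the toy `straight_line_plan` and `steer_toward` functions are all hypothetical names introduced here, and the RL controller is replaced by a simple geometric steering rule.

```python
import math
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = Tuple[float, float]      # (x, y) position; a real state would be richer
Waypoint = Tuple[float, float]


@dataclass
class HierarchicalDriver:
    """Runs a slow planner and a fast controller in one loop (illustrative only)."""
    plan: Callable[[State], List[Waypoint]]   # low-frequency, long-horizon planner
    act: Callable[[State, Waypoint], float]   # high-frequency, reactive controller
    replan_every: int = 10                    # assumed ratio of controller ticks per plan

    def __post_init__(self) -> None:
        self._tick = 0
        self._path: List[Waypoint] = []
        self.plan_calls = 0

    def step(self, state: State) -> float:
        # Refresh the long-horizon path only on every `replan_every`-th tick.
        if self._tick % self.replan_every == 0:
            self._path = self.plan(state)     # must return a non-empty path
            self.plan_calls += 1
        self._tick += 1
        # The controller reacts on every tick, tracking the next waypoint.
        return self.act(state, self._path[0])


def straight_line_plan(state: State) -> List[Waypoint]:
    """Toy planner (assumption): one waypoint 5 m ahead along the x-axis."""
    return [(state[0] + 5.0, 0.0)]


def steer_toward(state: State, target: Waypoint) -> float:
    """Toy stand-in for the learned policy: heading angle toward the waypoint."""
    return math.atan2(target[1] - state[1], target[0] - state[0])


driver = HierarchicalDriver(plan=straight_line_plan, act=steer_toward, replan_every=10)
commands = [driver.step((0.0, 0.0)) for _ in range(25)]
```

In this sketch the controller produces a command on all 25 ticks while the planner runs only on ticks 0, 10, and 20. In the framework described in the abstract, `act` would be the trained student policy rather than a hand-written rule.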

The proposed approach is validated in a realistic off-road simulation environment and benchmarked against standard RL and imitation learning baselines, showing improved terrain traversal and obstacle avoidance. Additionally, the trained policy is deployed on Sabrecat, a full-scale autonomous off-road ground vehicle. Experimental results demonstrate successful real-time execution, robust obstacle avoidance, and generalization to novel, complex terrains. This thesis contributes a practical and scalable solution to long-horizon off-road autonomy by combining hierarchical planning and guided reinforcement learning.

BibTeX:

@mastersthesis{Mundheda-2025-148198,
author = {Vedant Mundheda},
title = {Off-road Autonomous Driving via Guided Reinforcement Learning},
year = {2025},
month = {July},
school = {Carnegie Mellon University},
address = {Pittsburgh, PA},
number = {CMU-RI-TR-25-78},
}