Dreaming to Assist:
Learning to Align with Human Objectives for Shared Control in High-Speed Racing

CoRL 2024

Jonathan DeCastro*,  Andrew Silva*,  Deepak Gopinath,  Emily Sumner,  Thomas M. Balch,  Laporsha Dees,  Guy Rosman
Toyota Research Institute *Equal contribution

Overview. Dream2Assist combines a rich world model that infers human objectives and value functions from driving behavior with an assistive agent that provides appropriate expert assistance to a given human teammate.

Abstract

Tight coordination is required for effective human-robot teams in domains involving fast dynamics and tactical decisions, such as multi-car racing. In such settings, robot teammates must react to cues of a human teammate's tactical objective to assist in a way that is consistent with the objective (e.g., navigating left or right around an obstacle). To address this challenge, we present Dream2Assist, a framework that combines a rich world model able to infer human objectives and value functions, and an assistive agent that provides appropriate expert assistance to a given human teammate. Our approach builds on a recurrent state space model to explicitly infer human intents, enabling the assistive agent to select actions that align with the human and enabling a fluid teaming interaction. We demonstrate our approach in a high-speed racing domain with a population of synthetic human drivers pursuing mutually exclusive objectives, such as "stay-behind" and "overtake". We show that the combined human-robot team, when blending its actions with those of the human, outperforms synthetic humans alone and several baseline assistance strategies, and that intent-conditioning enables adherence to human preferences during task execution, leading to improved performance while satisfying the human's objective.

System Architecture

Training of Dream2Assist. Several human models are pre-trained to generate human-like driving behaviors with different objectives; their behaviors are then used to train the assistant's world model and the assistive agent policy. The world model is trained to infer the human's objective and reward function in order to provide expert assistance to a human teammate. The assistive agent is trained to provide assistance that aligns with the human's objectives, enabling a fluid teaming interaction.
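To make the shared-control idea concrete, here is a minimal sketch of intent-conditioned action blending. The function names (`infer_intent`, `blend_actions`), the heuristic intent classifier, and the fixed blending weight are all illustrative assumptions, not the paper's actual recurrent state-space model or control law:

```python
import numpy as np

def infer_intent(obs_history):
    """Hypothetical stand-in for the learned world model's intent inference.

    Classifies "overtake" vs. "stay-behind" from the recent lateral-position
    trend (column 0 of the observation history). The real system infers
    intent with a recurrent state-space model, not this heuristic.
    """
    lateral_trend = np.mean(np.diff(obs_history[:, 0]))
    return "overtake" if lateral_trend > 0.01 else "stay-behind"

def blend_actions(human_action, assist_action, alpha=0.5):
    """Shared control as a convex combination of human and assistant actions.

    alpha = 1.0 gives full human control; alpha = 0.0 gives full assistance.
    The blending scheme and weight here are illustrative assumptions.
    """
    return alpha * np.asarray(human_action) + (1.0 - alpha) * np.asarray(assist_action)

# Example: a driver drifting left while the assistant corrects steering.
history = np.array([[0.0, 0.0], [0.05, 0.0], [0.12, 0.0]])
intent = infer_intent(history)            # "overtake" (positive lateral trend)
blended = blend_actions([0.8, 1.0], [0.4, 0.6], alpha=0.5)  # [0.6, 0.8]
```

The key point the sketch illustrates is that the assistant's contribution is conditioned on the inferred intent, so corrections reinforce rather than fight the human's objective.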

Qualitative Results

Examples of the Dream2Assist agent's actions when paired with a human intending to pass and a human intending to stay. Dream2Assist recognizes the driver's intent, making lateral corrections for a safer overtake (left) or throttle adjustments to stay behind the opponent while still progressing towards the finish (right), thereby helping to satisfy task and human objectives.

Skill-Based Results

Change in track progress and return when adding assistance to five imperfect pass (left) and stay (right) fictitious humans in the hairpin and straightaway problem settings. The addition of Dream2Assist leads to higher gains in task performance and greater adherence to human objectives than baselines.

Videos

Poster

Citation

Please cite the paper as follows:

@inproceedings{decastro2024dream2assist,
  title     = {Dreaming to Assist: Learning to Align with Human Objectives for Shared Control in High-Speed Racing},
  author    = {Jonathan DeCastro and Andrew Silva and Deepak Gopinath and Emily Sumner and Thomas M. Balch and Laporsha Dees and Guy Rosman},
  booktitle = {8th Conference on Robot Learning (CoRL) 2024. Munich, Germany},
  year      = {2024}
}