

Poster

From Gaze to Movement: Predicting Visual Attention for Autonomous Driving Human-Machine Interaction based on Programmatic Imitation Learning

Yexin Huang · Yongbin Lin · Lishengsa Yue · Zhihong Yao · Jie Wang


Abstract: Human-machine interaction technology requires not only modeling the distribution of human visual attention but also predicting the trajectory of gaze points. We introduce $\textbf{PILOT}$, a programmatic imitation learning approach that predicts a driver's eye movements based on a set of rule-based conditions. These conditions—derived from driving operations and traffic flow characteristics—define how gaze shifts occur. They are initially identified through incremental synthesis, a heuristic search method, and then refined via L-BFGS, a numerical optimization technique. These human-readable rules enable us to understand drivers' eye movement patterns and make efficient, explainable predictions. We also propose $\textbf{DATAD}$, a dataset covering 12 types of autonomous driving takeover scenarios, collected from 60 participants and comprising approximately 600,000 frames of gaze point data. Compared to existing eye-tracking datasets, DATAD includes additional driving metrics and surrounding traffic flow characteristics, providing richer contextual information for modeling gaze behavior. Experimental evaluations of PILOT on DATAD demonstrate superior accuracy and faster prediction speeds compared to four baseline models. Specifically, PILOT reduces the MSE of predicted trajectories by 39.91% to 88.02% and improves the accuracy of gaze object predictions by 13.99% to 55.06%. Moreover, PILOT achieves these gains with approximately 30% lower prediction time, offering both more accurate and more efficient eye movement prediction.
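To make the refinement step concrete, the sketch below illustrates the general idea of tuning the numeric thresholds of a rule-based gaze-shift condition with L-BFGS. It is a minimal illustration, not the authors' implementation: the rule form, feature names, synthetic data, and loss are assumptions, and PILOT's actual rule language and incremental-synthesis procedure are defined in the paper.

```python
# Minimal sketch: refine the numeric parameters of a hand-written gaze rule
# with L-BFGS. All features, the rule shape, and the data are illustrative.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)

# Hypothetical per-frame driving features and observed gaze x-coordinates.
features = rng.normal(size=(1000, 2))   # e.g. [steering angle, lead-vehicle gap]
gaze_x = (0.8 * (features[:, 0] > 0.3)
          + 0.1 * features[:, 1]
          + rng.normal(scale=0.05, size=1000))

def predict_gaze(theta, feats):
    """Rule: if the steering angle exceeds a threshold, shift gaze by a fixed
    offset; otherwise follow the lead-vehicle gap with a linear gain."""
    thresh, offset, gain = theta
    # Smooth (sigmoid) gate so the objective is differentiable for L-BFGS.
    gate = 1.0 / (1.0 + np.exp(-(feats[:, 0] - thresh) / 0.05))
    return offset * gate + gain * feats[:, 1]

def mse(theta):
    return np.mean((predict_gaze(theta, features) - gaze_x) ** 2)

# A synthesis step would propose the rule structure; here we only refine
# its numeric parameters from a rough initial guess.
result = minimize(mse, x0=np.array([0.0, 0.5, 0.0]), method="L-BFGS-B")
print("refined (threshold, offset, gain):", result.x, "MSE:", result.fun)
```

The point of the sketch is the division of labor: a discrete search proposes human-readable rule structures, while a gradient-based optimizer fine-tunes their continuous parameters against recorded gaze data.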
