

Poster

ForeSight: Multi-View Streaming Joint Object Detection and Trajectory Forecasting

Sandro Papais · Letian Wang · Brian Cheong · Steven Waslander


Abstract:

We introduce ForeSight, a novel joint detection and forecasting framework for vision-based 3D perception in autonomous vehicles. Traditional approaches treat detection and forecasting as separate sequential tasks, which limits their ability to leverage temporal cues from past forecasts. ForeSight addresses this limitation with a multi-task streaming and bidirectional learning approach in which detection and forecasting share query memory and propagate information to each other. The forecast-aware detection transformer enhances spatial reasoning by integrating trajectory predictions from a multiple-hypothesis forecast memory queue, while the streaming forecast transformer improves temporal consistency using past forecasts and refined detections. Unlike tracking-based methods, ForeSight eliminates the need for explicit object association, reducing error propagation with a tracking-free model that scales efficiently across multi-frame sequences. Experiments on the nuScenes dataset show that ForeSight achieves state-of-the-art performance with an EPA of 54.9%, surpassing previous methods by 9.3%, while also attaining the highest mAP among multi-view detection models and maintaining competitive motion forecasting accuracy.
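
To make the idea of a multiple-hypothesis forecast memory queue more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes forecasts are stored as plain (x, y) waypoints per hypothesis and kept in a fixed-length FIFO; the names `Forecast`, `ForecastMemoryQueue`, and `priors_for` are hypothetical. In the actual model the memory presumably holds learned query features rather than raw coordinates.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Forecast:
    """Hypothetical container: forecasts emitted at one frame."""
    frame: int  # frame index at which the forecast was made
    # hypotheses[k][h] is the (x, y) position predicted for
    # frame + h + 1 under hypothesis (mode) k.
    hypotheses: list


class ForecastMemoryQueue:
    """Fixed-length FIFO of past forecasts (illustrative assumption)."""

    def __init__(self, max_frames=4):
        # deque with maxlen drops the oldest entry automatically.
        self._queue = deque(maxlen=max_frames)

    def push(self, forecast):
        self._queue.append(forecast)

    def priors_for(self, frame):
        """Collect every stored hypothesis waypoint that targets `frame`.

        No object identities are matched here; the result is an
        unordered set of location priors, in the spirit of a
        tracking-free design.
        """
        priors = []
        for fc in self._queue:
            step = frame - fc.frame - 1  # index into each hypothesis
            for mode in fc.hypotheses:
                if 0 <= step < len(mode):
                    priors.append(mode[step])
        return priors


queue = ForecastMemoryQueue(max_frames=2)
queue.push(Forecast(frame=0, hypotheses=[[(1.0, 0.0), (2.0, 0.0)],
                                         [(1.0, 0.5), (2.0, 1.0)]]))
queue.push(Forecast(frame=1, hypotheses=[[(2.1, 0.1)]]))
priors = queue.priors_for(2)
print(priors)
```

Under these assumptions, a detector at frame 2 would receive three candidate locations: two hypotheses forecast from frame 0 and one from frame 1. Because the queue has a fixed length, the cost of a lookup stays bounded as the sequence grows.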
