

Poster

ROADWork: A Dataset and Benchmark for Learning to Recognize, Observe, Analyze and Drive Through Work Zones

Anurag Ghosh · Shen Zheng · Robert Tamburo · Khiem Vuong · Juan Padilla · Hailiang Zhu · Nicholas Dunn · Michael Cardei · Christoph Mertz · Srinivasa Narasimhan


Abstract:

Perceiving and navigating autonomously through work zones is a challenging and underexplored problem. Open datasets for developing algorithms for this long-tailed scenario are scarce. We propose the ROADWork dataset to learn to recognize, observe, analyze, and drive through work zones. State-of-the-art foundation models perform poorly when applied to work zones, and fine-tuning models on our dataset significantly improves perception and navigation in work zones. With ROADWork, we discover new work zone images around the world with higher precision (+32.5%) and at a much higher rate (12.8×). Open-vocabulary methods fail on work zones, whereas detectors fine-tuned on our data improve performance (+32.2 AP). Vision-Language Models (VLMs) struggle to describe work zones, but fine-tuning substantially improves performance (+36.7 SPICE). Beyond fine-tuning, we show the value of simple techniques: video label propagation provides additional gains (+2.6 AP); composing a work zone sign detector with a text spotter via crop-scaling improves sign reading (+14.2% 1-NED); and composing work zone detections to provide context further reduces VLM hallucinations (+3.9 SPICE). Finally, we compute drivable paths from work zone navigation videos and predict navigational goals and pathways. Incorporating road work semantics ensures that 53.6% of goals have angular error (AE) < 0.5 degrees (+9.9%) and 75.3% of pathways have AE < 0.5 degrees (+8.1%).
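To make the crop-scaling composition concrete, here is a minimal sketch (not the authors' released code) of reading work zone signs by chaining a sign detector with a text spotter. The callables `detect_signs` and `spot_text`, and the parameter `target_size`, are hypothetical stand-ins for a detector fine-tuned on ROADWork, an off-the-shelf text-spotting model, and the rescaled crop resolution.

```python
# Hedged sketch, assuming a detector that returns sign bounding boxes and a
# text spotter that reads a single image crop; both are passed in by the user.
from PIL import Image


def read_work_zone_signs(image_path, detect_signs, spot_text, target_size=512):
    """Detect sign regions, enlarge each crop, then run the text spotter."""
    image = Image.open(image_path).convert("RGB")
    readings = []
    for (x1, y1, x2, y2) in detect_signs(image):  # sign bounding boxes
        crop = image.crop((x1, y1, x2, y2))
        # Work zone signs are often small and distant; scaling the crop up
        # lets the text spotter see characters at a legible resolution.
        scale = target_size / max(crop.size)
        crop = crop.resize((max(1, int(crop.width * scale)),
                            max(1, int(crop.height * scale))))
        readings.append({"box": (x1, y1, x2, y2), "text": spot_text(crop)})
    return readings
```

The design choice here is simply to decouple localization from recognition: the detector handles the long-tailed appearance of work zone signs, while the text spotter only ever sees upscaled, sign-centered crops.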
