Workshop
Workshop on Benchmarking Multi-Target Tracking: Towards Spatiotemporal Action Grounding in Videos
Tanveer Hannan, Shuaicong Wu, Mark Weber, Suprosanna Shit, Rajat Koner, Jindong Gu, Aljosa Osep, Prof. Dr. Thomas Seidl, Prof. Dr. Laura Leal-Taixé
Sun 19 Oct, 11 a.m. PDT
The 8th BMTT Workshop focuses on action-aware multi-object tracking, aiming to unify temporal action localization and object tracking through natural language queries. While existing benchmarks often address these tasks separately, this workshop presents unified challenges to evaluate both capabilities. Participants are encouraged to develop models that can understand complex actions, follow detailed language instructions, and track multiple objects across time. The workshop aims to close the gap between vision and language, advancing multimodal video understanding and supporting research on scalable, real-world systems capable of fine-grained, action-driven reasoning in dynamic scenes.