Poster
LA-MOTR: End-to-End Multi-Object Tracking by Learnable Association
Peng Wang · Yongcai Wang · Hualong Cao · Wang Chen · Deying Li
This paper proposes LA-MOTR, a novel Tracking-by-Learnable-Association framework that resolves the competing optimization objectives of detection and association in end-to-end Tracking-by-Attention (TbA) Multi-Object Tracking. Current TbA methods use a shared decoder for simultaneous object detection and tracklet association, which often results in task interference and suboptimal accuracy. By contrast, our end-to-end framework decouples these tasks into two specialized modules: Separated Object-Tracklet Detection (SOTD) and Spatial-Guided Learnable Association (SGLA). This decoupled design offers flexibility and explainability. In particular, SOTD independently detects new objects and existing tracklets in each frame, while SGLA associates them via a Spatial-Weighted Learnable Attention module guided by relative spatial cues. Temporal coherence is further maintained through a Tracklet Updates Module. The learnable association mechanism resolves the suboptimal association issues inherent in decoupled frameworks, while the decoupling avoids the task interference commonly observed in joint approaches. Evaluations on the DanceTrack, MOT17, and SportsMOT datasets demonstrate state-of-the-art performance. Extensive ablation studies validate the effectiveness of the designed modules. Code will be publicly available.
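To make the idea of spatially guided association more concrete, the following is a minimal illustrative sketch, not the authors' SGLA module. It assumes that each detection and tracklet has an embedding vector and a normalized box (cx, cy, w, h), scores every detection-tracklet pair with a scaled dot product, and adds a bias that penalizes distant box centers. The function names, the Gaussian-style distance penalty, the `sigma` and `min_score` parameters, and the per-detection argmax are all assumptions made for illustration; the abstract does not specify these details.

```python
import math


def spatial_weighted_association(track_emb, det_emb, track_boxes, det_boxes, sigma=0.5):
    """Return, for each detection, a softmax distribution over tracklets.

    Hypothetical sketch: embeddings are lists of floats, boxes are
    (cx, cy, w, h) tuples in normalized image coordinates.
    """
    dim = len(track_emb[0])
    weights = []
    for d_vec, d_box in zip(det_emb, det_boxes):
        logits = []
        for t_vec, t_box in zip(track_emb, track_boxes):
            # Appearance term: scaled dot-product similarity.
            sim = sum(a * b for a, b in zip(d_vec, t_vec)) / math.sqrt(dim)
            # Spatial term (assumed form): penalize squared center distance.
            dist2 = (d_box[0] - t_box[0]) ** 2 + (d_box[1] - t_box[1]) ** 2
            logits.append(sim - dist2 / (2.0 * sigma ** 2))
        # Numerically stable softmax over tracklets.
        peak = max(logits)
        exps = [math.exp(value - peak) for value in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def greedy_match(weights, min_score=0.5):
    """Pick the highest-weight tracklet per detection, or None if below min_score.

    This is a per-detection argmax, so it does not enforce one-to-one matching.
    """
    matches = []
    for row in weights:
        best = max(range(len(row)), key=row.__getitem__)
        matches.append(best if row[best] >= min_score else None)
    return matches
```

In this sketch a detection left unmatched (None) would be a candidate for a new tracklet. In a learned version the embeddings and the spatial weighting would come from trained layers rather than the fixed formula assumed here.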