ICCV Poster AnnofreeOD: Detecting All Classes at Low Frame Rates Without Human Annotations

Boyi Sun · Yuhang Liu · Houxin He · Yonglin Tian · Fei-Yue Wang

Exhibit Hall I #489

Tue 21 Oct 6:15 p.m. PDT — 8:15 p.m. PDT

Abstract: Manual annotation of 3D bounding boxes in large-scale 3D scenes is expensive and time-consuming, which motivates annotation-free 3D object detection from unlabeled point cloud data. Existing unsupervised 3D detection frameworks predominantly identify moving objects via scene flow, which has significant limitations: (1) few detectable classes ($<3$), (2) difficulty in detecting stationary objects, and (3) reliance on high frame rates. To address these limitations, we propose AnnofreeOD, a novel Annotation-free Object Detection framework based on 2D-to-3D knowledge distillation. First, we explore an effective strategy for generating high-quality pseudo boxes from single-frame 2D knowledge. Second, because the pseudo boxes from the first step are noisy, we introduce Noise-Resistant Regression (NRR) based on Box Augmentation (BA). AnnofreeOD achieves state-of-the-art performance across multiple experiments. On the nuScenes dataset, we establish the first annotation-free 10-class object detection baseline, achieving 40% of fully supervised performance. Furthermore, in 3-class and class-agnostic object detection tasks, our approach surpasses prior state-of-the-art methods by +9.3% mAP (+12.2% NDS) and +6.0% AP (+7.2% NDS), respectively, significantly improving precision. The code and model weights are provided in the supplementary material.
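
The abstract names Box Augmentation (BA) but does not say how it works. As a rough illustration only, the sketch below assumes that augmenting a noisy pseudo box means drawing randomly jittered copies of its center, size, and yaw. The `Box3D` type, the `jitter_box` and `augment` helpers, and all noise scales are hypothetical and are not taken from the paper or its released code.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Box3D:
    """A 3D box: center (x, y, z) in meters, size in meters, yaw in radians."""
    x: float
    y: float
    z: float
    length: float
    width: float
    height: float
    yaw: float


def jitter_box(box, rng, center_sigma=0.2, scale_sigma=0.1, yaw_sigma=0.1):
    """Return a randomly perturbed copy of `box`.

    Assumption: Gaussian noise on center and yaw, multiplicative Gaussian
    noise on size. The noise scales are illustrative placeholders.
    """
    def scaled(value):
        # Clamp the scale factor so that box dimensions stay positive.
        return value * max(0.1, 1.0 + rng.gauss(0.0, scale_sigma))

    yaw = box.yaw + rng.gauss(0.0, yaw_sigma)
    # Wrap the heading angle back into [-pi, pi).
    yaw = (yaw + math.pi) % (2.0 * math.pi) - math.pi

    return Box3D(
        x=box.x + rng.gauss(0.0, center_sigma),
        y=box.y + rng.gauss(0.0, center_sigma),
        z=box.z + rng.gauss(0.0, center_sigma),
        length=scaled(box.length),
        width=scaled(box.width),
        height=scaled(box.height),
        yaw=yaw,
    )


def augment(box, n, seed=0):
    """Draw `n` jittered variants of one pseudo box, reproducibly."""
    rng = random.Random(seed)
    return [jitter_box(box, rng) for _ in range(n)]


# Example: five jittered variants of a car-sized pseudo box.
pseudo_box = Box3D(x=10.0, y=2.0, z=0.0, length=4.5, width=1.9, height=1.6, yaw=0.3)
variants = augment(pseudo_box, n=5)
```

In a setup like this, the jittered variants could serve as extra regression targets or inputs so that a detector is less sensitive to errors in any single pseudo box. Whether NRR uses them this way is not stated in the abstract.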