Poster

Unified Category-Level Object Detection and Pose Estimation from RGB Images using 3D Prototypes

Tom Fischer · Xiaojie Zhang · Eddy Ilg


Abstract:

Recognizing objects in images is a fundamental problem in computer vision. While detecting objects in 2D images is common, many applications require determining their pose in 3D space. Traditional category-level methods rely on RGB-D inputs, which may not always be available, or employ two-stage approaches that use separate models and representations for detection and pose estimation. For the first time, we introduce a unified model that integrates detection and pose estimation into a single framework for RGB images by leveraging neural mesh models with learned features and multi-model RANSAC. Our approach achieves state-of-the-art results for RGB category-level pose estimation on REAL275, outperforming the previous state of the art by 5.5%, averaged across all scale-agnostic metrics. Finally, we demonstrate that our unified method exhibits significantly greater robustness than single-stage baselines.
