Poster
Scaling 3D Compositional Models for Robust Classification and Pose Estimation
Xiaoding Yuan · Prakhar Kaushik · Guofeng Zhang · Artur Jesslen · Adam Kortylewski · Alan Yuille
Deep learning algorithms for object classification and 3D object pose estimation lack robustness to out-of-distribution factors such as synthetic stimuli, changes in weather conditions, and partial occlusion. Recently, a class of Neural Mesh Models has been developed in which objects are represented as 3D meshes with learned features at the vertices. These models have shown robustness in small-scale settings involving 10 object classes, but it is unclear whether they can be scaled up to hundreds of classes. The main obstacle is that their training involves contrastive learning among the vertices of all object classes, which scales quadratically with the number of classes. We present a strategy that exploits the compositionality of the objects, i.e., the independence of the vertex feature vectors, which greatly reduces training time while also improving performance. We first restructure the per-vertex contrastive learning into within-class and between-class contrast. We then propose a process that dynamically decouples the contrast between classes that are rarely confused and enhances the contrast between the vertices of the most frequently confused classes. Our large-scale 3D compositional model not only achieves state-of-the-art performance on joint classification and pose estimation, surpassing Neural Mesh Models and standard DNNs, but is also more robust to out-of-distribution testing, including occlusion, weather conditions, synthetic data, and generalization to unknown classes.
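The abstract describes two ingredients: splitting per-vertex contrastive learning into within-class and between-class terms, and restricting between-class contrast to frequently confused class pairs to avoid the quadratic cost. The sketch below illustrates one plausible reading of that idea in PyTorch; all names (`within_class_loss`, `between_class_loss`, `banks`, `confusion`, the top-k selection of hard classes) are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn.functional as F

def within_class_loss(anchors, bank, tau=0.07):
    """Contrast each vertex against the other vertices of the SAME class.

    anchors: (V, D) L2-normalized features extracted for the V visible
             vertices of one object instance.
    bank:    (V, D) L2-normalized running feature per vertex of that class.
             Row v of `bank` is the positive for row v of `anchors`;
             all other rows act as within-class negatives.
    """
    logits = anchors @ bank.T / tau            # (V, V) similarity matrix
    targets = torch.arange(anchors.size(0))    # vertex v should match bank row v
    return F.cross_entropy(logits, targets)

def between_class_loss(banks, confusion, k=5, tau=0.07):
    """Contrast vertex banks only across the k most-confused class pairs,
    cutting the between-class cost from O(C^2) toward O(C * k).

    banks:     (C, V, D) per-class vertex feature banks (L2-normalized).
    confusion: (C, C) running estimate of how often class i is confused
               with class j (hypothetical bookkeeping; diagonal ignored).
    """
    C, V, D = banks.shape
    conf = confusion.clone()
    conf.fill_diagonal_(float('-inf'))         # never contrast a class with itself
    hard = conf.topk(k, dim=1).indices         # (C, k) hardest classes per class
    loss = banks.new_zeros(())
    for i in range(C):
        anchors = banks[i]                     # (V, D)
        negs = banks[hard[i]].reshape(-1, D)   # (k*V, D) vertices of confused classes
        sim = anchors @ negs.T / tau           # (V, k*V)
        # penalize high similarity to vertices of frequently confused classes
        loss = loss + torch.logsumexp(sim, dim=1).mean()
    return loss / C
```

In this reading, "dynamically decoupling" rarely confused classes corresponds to refreshing `confusion` during training (e.g., from running classification errors) so that each class only pays contrastive cost against the classes it is currently mistaken for; how the paper actually maintains and updates this confusion estimate is not specified in the abstract.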