Poster
Semantic Causality-Aware Vision-Based 3D Occupancy Prediction
Dubing Chen · Huan Zheng · Yucheng Zhou · Xianfei Li · Wenlong Liao · Tao He · Pai Peng · Jianbing Shen
Vision-based 3D semantic occupancy prediction is essential for autonomous systems, converting 2D camera data into 3D semantic grids. Current methods struggle to align 2D evidence with 3D predictions, undermining reliability and interpretability; this limitation motivates a re-examination of the task’s causal foundations. We propose a novel approach that leverages causal principles to enforce semantic consistency in the 2D-to-3D geometric transformation. Our framework introduces a causal loss that backpropagates 3D class features to 2D space for semantic alignment, ensuring that each 3D location accurately reflects its corresponding 2D region. Building on this loss, we develop a Semantic Causality-Aware Lifting (SCA Lifting) method with three components, all guided by the causal loss: Channel-Grouped Lifting, which adaptively maps distinct semantics to appropriate positions; Learnable Camera Parameters, which improve robustness to camera perturbations; and Normalized Convolution, which propagates features into sparse regions. Experimental results show that our approach significantly improves robustness to camera perturbations, strengthens semantic causal consistency in 2D-to-3D transformations, and yields substantial accuracy gains on the Occ3D dataset, positioning it as a versatile solution for advancing 3D vision.
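The Normalized Convolution component mentioned above can be illustrated with a minimal sketch. The idea of normalized convolution in general is to convolve both the features and a validity mask with the same kernel, then divide, so that empty voxels receive a proper average of only their observed neighbors. The code below is a hypothetical NumPy illustration with a 3×3×3 box kernel, not the authors' implementation; the function name, tensor layout, and kernel choice are assumptions.

```python
import numpy as np

def normalized_conv3d(feat, mask, eps=1e-6):
    """Fill empty voxels with the mask-normalized average of observed neighbors.

    feat: (C, D, H, W) lifted voxel features.
    mask: (D, H, W), 1 where a voxel received a lifted 2D feature, 0 otherwise.
    (Illustrative sketch only; a real model would use a learned kernel.)
    """
    fm = feat * mask[None]                              # zero out empty voxels
    pf = np.pad(fm, ((0, 0), (1, 1), (1, 1), (1, 1)))   # pad spatial dims
    pm = np.pad(mask.astype(feat.dtype), 1)
    num = np.zeros_like(feat)
    den = np.zeros(mask.shape, dtype=feat.dtype)
    D, H, W = mask.shape
    # 3x3x3 box convolution implemented as a sum of shifted views.
    for dz in range(3):
        for dy in range(3):
            for dx in range(3):
                num += pf[:, dz:dz + D, dy:dy + H, dx:dx + W]
                den += pm[dz:dz + D, dy:dy + H, dx:dx + W]
    # Dividing by the summed mask normalizes out the varying count of valid
    # neighbors, so sparse regions get an unbiased average of observed features.
    out = num / (den[None] + eps)
    # Keep observed features as-is; fill only the empty voxels.
    return np.where(mask[None] > 0, feat, out)
```

Because the denominator counts only valid neighbors, a voxel next to a single observed voxel inherits that voxel's feature rather than a value diluted by zeros, which is what makes this propagation suitable for the sparse regions produced by 2D-to-3D lifting.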