Skip to yearly menu bar Skip to main content


Poster

Partially Matching Submap Helps: Uncetainty Modeling and Propagation for Text to Point Cloud Localization

Mingtao Feng · Longlong Mei · Zijie Wu · Jianqiao Luo · Fenghao Tian · Jie Feng · Weisheng Dong · Yaonan Wang


Abstract:

Text to point cloud cross-modal localization is a crucial vision-language task for future human-robot collaboration. Existing coarse-to-fine frameworks assume that each query text precisely corresponds to the center area of a submap, limiting their applicability in real-world scenarios. This work redefines the task under a more realistic assumption, relaxing the one-to-one retrieval constraint by allowing patially matching query text and submap pairs. To address this challenge, we augment datasets with partially matching submaps and introduce an uncertainty-aware framework. Specifically, we model cross-modal ambiguity in fine-grained location regression by integrating uncertainty scores, represented as 2D Gaussian distributions, to mitigate the impact of challenging samples. Additionally, we propose an uncertainty-aware similarity metric that enhances similarity assessment between query text and submaps by propagating uncertainty into coarse place recognition, enabling the model to learn discriminative features, effectively handle partially matching samples and improve task synergy. Extensive experiments on KITTI360Pose and CityRefer demonstrate that our method achieves state-of-the-art performance across both stages. Our code will be publicly available.

Live content is unavailable. Log in and register to view live content