

Poster

MMGeo: Multimodal Compositional Geo-Localization for UAVs

Yuxiang Ji · Boyong He · Zhuoyue Tan · Liaoni Wu


Abstract:

Multimodal geo-localization methods can inherently overcome the limitations of unimodal sensor systems by leveraging complementary information from different modalities. However, existing retrieval-based methods rely on a comprehensive multimodal database, which is often difficult to obtain in practice. In this paper, we introduce a more practical problem: localizing drone-view images by composing multimodal data against a satellite-view reference map, which integrates multimodal information while avoiding the need for an extensive multimodal database. We present MMGeo, which learns to push the composition of multimodal representations toward the target reference map through a unified framework. By exploiting a comprehensive multimodal query (image, point cloud/depth/text), we achieve more robust and accurate geo-localization, especially in unknown and complex environments. Additionally, we extend two visual geo-localization datasets, GTA-UAV and UAV-VisLoc, to multiple modalities, establishing the first UAV geo-localization datasets that combine image, point cloud, depth, and text data. Experiments demonstrate the effectiveness of MMGeo for UAV multimodal compositional geo-localization, as well as its generalization to real-world scenarios.
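To make the compositional retrieval setup concrete, below is a minimal sketch (not the authors' code) of the pattern the abstract describes: per-modality query embeddings are composed into a single vector and matched against precomputed satellite-view tile embeddings. The encoders, mean fusion, embedding dimension, and tile count are all placeholder assumptions for illustration.

```python
# Sketch of compositional multimodal retrieval against a satellite
# reference map. All shapes, fusion, and embeddings are assumptions.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
dim, num_tiles = 256, 1000

# Placeholder per-modality query embeddings; in practice these would come
# from modality-specific encoders (image, point cloud/depth, text).
query_embeddings = {
    "image": torch.randn(dim),
    "depth": torch.randn(dim),
    "text": torch.randn(dim),
}

# Compose the modalities into one query vector (mean fusion is an
# assumption) and L2-normalize for cosine similarity.
query = F.normalize(torch.stack(list(query_embeddings.values())).mean(0), dim=0)

# Precomputed, normalized satellite-view reference tile embeddings.
reference_map = F.normalize(torch.randn(num_tiles, dim), dim=1)

# Retrieve the best-matching map tile by cosine similarity.
scores = reference_map @ query
best_tile = scores.argmax().item()
print(f"retrieved tile {best_tile} with similarity {scores[best_tile]:.3f}")
```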
