Poster
Multi-Schema Proximity Network for Composed Image Retrieval
Jiangming Shi · Xiangbo Yin · yeyunchen yeyunchen · Yachao Zhang · Zhizhong Zhang · Yuan Xie · Yanyun Qu
Composed Image Retrieval (CIR) aims to retrieve a target image using a query that combines a reference image and a textual description, allowing users to express their intent more effectively. Despite significant advances in CIR methods, two problems remain unresolved: 1) existing methods overlook multi-schema interaction because they lack fine-grained explicit visual supervision, which hinders the capture of complex correspondences, and 2) existing methods overlook noisy negative pairs formed by query-target pairs that potentially correspond, which increases confusion. To address these problems, we propose a Multi-schemA Proximity Network (MAPNet) for CIR, consisting of two key components: Multi-Schema Interaction (MSI) and Relaxed Proximity Loss (RPLoss). Specifically, MSI leverages textual descriptions as an implicit guide to establish correspondences between multiple objects and attributes in the reference and target images, enabling multi-schema interactions. RPLoss then further aligns the query and target features while mitigating the harmful influence of noisy negative pairs through a denoising and reweighting strategy. Comprehensive experiments on CIRR, FashionIQ, and LaSCo demonstrate that MAPNet achieves competitive results against state-of-the-art CIR methods. The source code will be made publicly available after the paper is accepted.
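To make the idea of denoising and reweighting noisy negatives concrete, the following is a minimal sketch of a contrastive loss with relaxed negatives. It is not the authors' RPLoss, whose exact formulation is not given in the abstract. The function name `relaxed_contrastive_loss`, the `drop_threshold` and `temperature` parameters, and the rule of using target-to-target similarity to detect likely false negatives are all assumptions made for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def relaxed_contrastive_loss(queries, targets, temperature=0.1, drop_threshold=0.9):
    """Illustrative InfoNCE-style loss with relaxed negatives (hypothetical).

    queries[i] is the composed query feature (reference image + text) and
    targets[i] is its paired target feature. Every other target in the batch
    is a negative for query i, but some of those may be unlabeled matches.
    """
    n = len(queries)
    total = 0.0
    for i in range(n):
        pos = math.exp(cosine(queries[i], targets[i]) / temperature)
        denom = pos
        for j in range(n):
            if j == i:
                continue
            # Assumed noise signal: a negative target that is almost
            # identical to the true target is probably a false negative.
            target_sim = cosine(targets[i], targets[j])
            if target_sim >= drop_threshold:
                continue  # denoising: drop the suspicious negative
            # Reweighting: the more a negative resembles the true target,
            # the less it contributes to the denominator.
            weight = 1.0 - max(0.0, target_sim)
            neg = math.exp(cosine(queries[i], targets[j]) / temperature)
            denom += weight * neg
        total += -math.log(pos / denom)
    return total / n
```

In this sketch, a batch in which two queries share the same target produces zero loss, whereas a standard contrastive loss would penalize each query for matching the other's identical target.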