

Poster

Progressive Distribution Bridging: Unsupervised Adaptation for Large-scale Pre-trained Models via Adaptive Auxiliary Data

Weinan He · Yixin Zhang · Zilei Wang


Abstract:

Large-scale pre-trained Vision-Language Models (VLMs) like CLIP have demonstrated promising zero-shot transfer capabilities to downstream tasks. However, their performance deteriorates when facing significant domain shifts. In this paper, we focus on cost-effective adaptation of large-scale pre-trained VLMs to unlabeled target domains. In this context, two prevalent paradigms show inherent limitations: Unsupervised Fine-Tuning (UFT) struggles with poor initial model performance, while Unsupervised Domain Adaptation (UDA) may suffer from the adverse effects of an inappropriate auxiliary source domain. To alleviate these limitations, we propose to adaptively construct more suitable auxiliary data from large-scale image-text pairs to facilitate unsupervised adaptation without any human annotations. Specifically, we introduce Progressive Distribution Bridging (PDB), which decomposes the challenging adaptation task into multiple simple steps through the construction of auxiliary data. To obtain such data, we design an efficient and controllable retrieval algorithm incorporating cascaded semantic filters and a style controller to regulate the semantic category and domain style of the retrieved data, respectively. Experimental results across 11 different domains from three standard UDA benchmarks demonstrate the effectiveness of our auxiliary data. Notably, on Office-Home, our method outperforms state-of-the-art UDA methods that rely on labeled source domains. The proposed method offers a more universal and cost-effective solution for adapting VLMs to unlabeled downstream tasks.
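The abstract does not give implementation details, but the retrieval idea it describes (a semantic filter to keep images matching the task categories, plus a style controller selecting images at intermediate styles between a source-like and the target distribution) can be illustrated with a minimal sketch. All names, thresholds, and the interpolation mechanism below are assumptions for illustration; embeddings stand in for CLIP image/text features.

```python
import numpy as np

def normalize(x):
    # L2-normalize along the last axis so dot products are cosine similarities
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def retrieve_auxiliary(pool_emb, class_emb, source_style, target_style,
                       n_steps=3, sem_thresh=0.2, per_step=5):
    """Hypothetical sketch of stepped auxiliary-data retrieval.

    pool_emb:     (N, d) image embeddings of a large web-scale pool
    class_emb:    (C, d) text embeddings of the task's class prompts
    source_style: (d,)   embedding summarizing a source-like style
    target_style: (d,)   embedding summarizing the target-domain style
    Returns a list of n_steps index arrays, one per bridging step.
    """
    pool = normalize(pool_emb)
    classes = normalize(class_emb)

    # Semantic filter: keep images close to at least one class prompt.
    sem_sim = pool @ classes.T                  # (N, C) cosine similarities
    idx = np.where(sem_sim.max(axis=1) >= sem_thresh)[0]

    # Style controller: build intermediate style anchors between source and
    # target, and rank the filtered images against each anchor.
    steps = []
    for t in np.linspace(0.0, 1.0, n_steps + 2)[1:-1]:  # interior points only
        anchor = normalize((1 - t) * source_style + t * target_style)
        style_sim = pool[idx] @ anchor          # (len(idx),)
        steps.append(idx[np.argsort(-style_sim)][:per_step])
    return steps
```

In this sketch, each successive step yields images whose style sits closer to the target domain, so fine-tuning on the steps in order approximates the paper's idea of bridging the distribution gap gradually rather than in one jump. The cascaded-filter and controller internals in the actual method may differ substantially.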
