ICCV Poster Training-Free Class Purification for Open-Vocabulary Semantic Segmentation

Qi Chen · Lingxiao Yang · Yun Chen · Nailong Zhao · Jianhuang Lai · Jie Shao · Xiaohua Xie

Abstract:

Fine-tuning pre-trained vision-language models has proven effective in enhancing open-vocabulary semantic segmentation (OVSS). However, given the significant resources required for training on large datasets, there is growing interest in training-free methods for OVSS. Current training-free methods primarily focus on modifying model architectures and generating prototypes to improve segmentation performance, often overlooking issues of category redundancy and ambiguity. In this paper, we identify two key phenomena in OVSS: class redundancy and vision-language ambiguity, both of which appear in class activation maps and in their affinity-refined counterparts. Inspired by these observations, we propose a training-free class purification framework, FreeCP, to purify semantic categories and address the errors caused by these two issues. Specifically, we first generate class activation maps, along with their refined activation maps, using CLIP. These activations and their refined counterparts are then organized by their associated categories to adaptively construct category relations, i.e., per-category relations and cross-category relations. We then perform redundancy purification to eliminate classes that are not present in the current image. Furthermore, we propose ambiguity purification to distinguish the correct class from semantically similar ones. The purified classes are subsequently used to produce the final segmentation prediction. Extensive experiments across eight benchmarks demonstrate that FreeCP, as a plug-and-play module, yields significant performance gains when combined with other OVSS methods. Our code will be made publicly available.
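
To make the idea of redundancy purification concrete, the following is a minimal Python sketch, not the FreeCP implementation. It assumes per-class activation maps are already available as nested lists, and it uses a simple peak-activation threshold as a stand-in presence test; the abstract does not specify the actual criterion, which is built from category relations. The function name `purify_and_segment`, the parameter `presence_threshold`, and the toy scores are all hypothetical. Ambiguity purification is not covered.

```python
def purify_and_segment(activations, class_names, presence_threshold=0.5):
    """Drop classes judged absent, then label each pixel among the survivors.

    activations[c][y][x] is the activation score of class c at pixel (y, x).
    The peak-threshold presence test is an illustrative assumption only.
    """
    # Peak activation of each class over the whole image.
    peaks = [max(max(row) for row in cam) for cam in activations]
    # Keep classes whose peak reaches the (hypothetical) threshold.
    keep = [c for c, p in enumerate(peaks) if p >= presence_threshold]
    if not keep:
        # Fall back to the single strongest class so every pixel gets a label.
        keep = [peaks.index(max(peaks))]
    height, width = len(activations[0]), len(activations[0][0])
    # Per-pixel argmax restricted to the kept classes.
    label_map = [
        [max(keep, key=lambda c: activations[c][y][x]) for x in range(width)]
        for y in range(height)
    ]
    return [class_names[c] for c in keep], label_map


# Toy 2x2 image with three candidate classes; "bus" is weakly active everywhere.
acts = [
    [[0.9, 0.8], [0.1, 0.2]],    # cat
    [[0.2, 0.1], [0.25, 0.9]],   # grass
    [[0.3, 0.3], [0.3, 0.3]],    # bus (not actually present)
]
kept, seg = purify_and_segment(acts, ["cat", "grass", "bus"])
print(kept)  # ['cat', 'grass']
print(seg)   # [[0, 0], [1, 1]]
```

In this toy input, a plain per-pixel argmax over all three classes would assign the bottom-left pixel to "bus" (0.3 beats 0.1 and 0.25). Removing the absent class first lets that pixel fall to "grass" instead.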
