Poster

DeFSS: Image-to-Mask Denoising Learning for Few-shot Segmentation

Zishu Qin · Junhao Xu · Weifeng Ge


Abstract:

Deep learning algorithms are highly data-intensive, particularly for tasks that require pixel-level annotations, such as semantic segmentation, making pixel-level image understanding costly to achieve. Few-shot segmentation addresses this challenge by enabling models to segment novel objects using only a limited number of labeled support images as references. In this paper, we argue that the traditional image-to-mask decoding framework relies excessively on the quality of the support samples and is therefore prone to errors under class bias. We thus propose a novel image-to-mask denoising learning paradigm for few-shot segmentation that transforms mask decoding into a denoising process, leveraging denoising diffusion models to reduce this reliance on support samples. We formulate image-to-mask denoising learning in two stages: an image corruption stage and a mask denoising stage. In the first stage, we introduce an adaptive image corruption method that perturbs the image according to regional semantics, motivated by the insight that perturbing data populates low-density data regions. In the second stage, we adopt an in-model denoising paradigm, designing a network that performs support-to-query semantic propagation and mask denoising in a single forward pass. To enhance the denoising network's categorical discrimination, we incorporate discriminative attribute learning, which uses base classes to train the model to distinguish object categories and to generalize to novel classes. Extensive experiments and ablation studies validate the effectiveness of our approach, demonstrating competitive performance across various benchmarks.
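The two-stage pipeline described above can be illustrated with a toy sketch. This is not the authors' implementation: the region-wise Gaussian corruption and the prototype-similarity "denoiser" below are simplified stand-ins (all function names, the `region_map`/`noise_scales` inputs, and the thresholding step are hypothetical), meant only to show the shape of an adaptive-corruption-then-denoise-to-mask loop.

```python
import numpy as np

def adaptive_corruption(image, region_map, noise_scales, rng):
    """Stage 1 (illustrative): perturb each semantic region with its own
    noise scale, standing in for the paper's adaptive image corruption."""
    noisy = image.astype(float).copy()
    for region_id, scale in noise_scales.items():
        region = region_map == region_id
        noisy[region] += rng.normal(0.0, scale, size=int(region.sum()))
    return noisy

def denoise_to_mask(noisy_image, support_prototype, threshold=0.5):
    """Stage 2 (illustrative): score each pixel by similarity to a
    support-derived prototype value, then threshold into a binary mask.
    A single forward pass of the real network would replace this."""
    scores = 1.0 / (1.0 + np.abs(noisy_image - support_prototype))
    return (scores > threshold).astype(np.uint8)

# Hypothetical usage: a flat query image with two semantic regions.
rng = np.random.default_rng(0)
image = np.zeros((4, 4))
region_map = np.zeros((4, 4), dtype=int)
region_map[:, 2:] = 1  # second region gets stronger corruption
noisy = adaptive_corruption(image, region_map, {0: 0.1, 1: 0.5}, rng)
mask = denoise_to_mask(noisy, support_prototype=0.0)
```

In this sketch, the per-region `noise_scales` mimic corruption driven by regional semantics, and the prototype comparison mimics support-to-query propagation; the real method replaces both with learned components.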
