

Poster

GlassWizard: Harvesting Diffusion Priors for Glass Surface Detection

Wenxue Li · Tian Ye · Xinyu Xiong · Jinbin Bai · Feilong Tang · Wenxuan Song · Zhaohu Xing · Lie Ju · Guanbin Li · Lei Zhu


Abstract:

Glass Surface Detection (GSD) is a critical task in computer vision, enabling precise interactions with transparent surfaces and enhancing both safety and object recognition accuracy. However, current research still faces challenges in both recognition performance and generalization capability. Thanks to recent advances in diffusion-based generative models, the GSD task can benefit from the rich prior knowledge encapsulated in the pre-trained Stable Diffusion (SD) model. Thus, in this paper, we present GlassWizard, which aims to harvest priors from a diffusion-based model to achieve accurate and generalized GSD. Firstly, we delve into the text embedding space in SD to build a text-based context prior, thereby enhancing the understanding of the implicit attributes of glass and achieving fine-grained predictions. Secondly, we train an end-to-end diffusion model with a one-step formulation pipeline, yielding effective optimization and fast inference. In addition, to make our adapted framework scalable to other multi-modal GSD tasks (such as RGB-D/RGB-T GSD), we present a modality-customized adaptation that achieves rapid adaptation to these tasks. Our experimental results demonstrate that the proposed framework achieves cutting-edge performance across diverse datasets and shows strong generalization ability. Additionally, it excels in multi-modal GSD tasks, confirming its scalability across different modalities. The code will be publicly released.
