Tutorial
Foundation Models in Visual Anomaly Detection: Advances, Challenges, and Applications
Jiawen Zhu · Chengjie Wang · Guansong Pang
In recent years, foundation models have emerged as transformative tools in computer vision, offering powerful zero-shot and few-shot learning capabilities across a wide range of tasks. Their integration into visual anomaly detection—a critical and high-stakes field spanning healthcare, industrial inspection, security, and autonomous systems—has opened new frontiers in both research and real-world applications. This tutorial aims to deliver a comprehensive and timely overview of the role of foundation models in visual anomaly detection. We will cover multiple visual modalities, including 2D images, 3D images, and videos, each presenting unique challenges and necessitating modality-specific solutions. Specifically, we will delve into the entire pipeline, from data preparation and (pre-)training to prompt engineering, methodological innovations, inference strategies, and deployment in real-world environments. Key topics include zero- and few-shot learning, pseudo-labeling, anomaly generation, and multi-modal alignment between vision and language. To facilitate a deep and practical understanding of these areas, the tutorial brings together leading experts from both academia and industry. Through in-depth technical presentations and discussions, participants will gain valuable insights into the latest advances, real-world applications, and open challenges shaping this rapidly evolving field.
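To make the zero-shot, vision-language direction mentioned above concrete, the following is a minimal sketch of prompt-based anomaly scoring in the spirit of CLIP-style methods: an image embedding is compared against text embeddings for a "normal" and an "anomalous" prompt, and a softmax over the cosine similarities yields an anomaly probability. The function name, the temperature value, and the toy embeddings are illustrative assumptions; a real system would obtain the embeddings from a pretrained vision-language encoder with a shared embedding space.

```python
import numpy as np

def anomaly_score(image_emb, normal_emb, anomalous_emb, temperature=0.07):
    """Score an image embedding against 'normal' vs. 'anomalous' prompt
    embeddings (hypothetical sketch, not any specific published method).

    Returns the softmax probability assigned to the anomalous prompt,
    computed from temperature-scaled cosine similarities.
    """
    def cos(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Cosine similarity of the image to each text prompt embedding.
    sims = np.array([
        cos(image_emb, normal_emb),
        cos(image_emb, anomalous_emb),
    ]) / temperature

    # Numerically stable softmax; index 1 is the anomalous prompt.
    probs = np.exp(sims - sims.max())
    probs /= probs.sum()
    return probs[1]

# Toy usage with hand-made embeddings (assumptions, not encoder outputs):
normal_emb = np.array([1.0, 0.0, 0.0])
anomalous_emb = np.array([0.0, 1.0, 0.0])
test_image_emb = np.array([0.9, 0.1, 0.0])  # closer to the normal prompt
print(anomaly_score(test_image_emb, normal_emb, anomalous_emb))
```

In practice this binary-prompt scheme is extended with prompt ensembles and patch-level embeddings to localize anomalies, which connects directly to the prompt-engineering and multi-modal alignment topics covered in the tutorial.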