Poster
ReMP-AD: Retrieval-enhanced Multi-modal Prompt Fusion for Few-Shot Industrial Visual Anomaly Detection
Hongchi Ma · Guanglei Yang · Debin Zhao · Yanli Ji · Wangmeng Zuo
Industrial visual inspection is crucial for detecting defects in manufactured products, but it has traditionally relied on human operators, which is inefficient. Industrial Visual Anomaly Detection (IVAD) has emerged as a promising solution, with approaches including zero-shot, few-shot, and reconstruction-based methods. However, zero-shot methods struggle with subtle anomalies, and reconstruction-based methods fail to capture fine-grained details. Few-shot methods, which use a limited number of samples together with prompts, offer a more efficient approach. Despite their promise, challenges remain in managing intra-class variation among references and in extracting more representative anomaly features. This paper presents Retrieval-enhanced Multi-modal Prompt Fusion Anomaly Detection (ReMP-AD), a framework that introduces Intra-Class Token Retrieval (ICTR) to reduce noise in the memory bank and Vision-Language Prior Fusion (VLPF) to guide the encoder in capturing more distinctive and relevant features of anomalies. Experiments on the VisA and MVTec-AD datasets demonstrate that ReMP-AD outperforms existing methods, achieving 97.8%/94.1% performance in 4-shot anomaly segmentation and classification. Our approach also shows strong results on the PCB-Bank dataset, highlighting its effectiveness in few-shot industrial anomaly detection.
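For readers unfamiliar with memory-bank scoring in few-shot anomaly detection, the general idea can be sketched as follows. This is a generic nearest-neighbour baseline and not the ICTR or VLPF modules described above: the function names, the cosine-similarity choice, and the top-k averaging are all illustrative assumptions. Feature tokens from the few normal reference images form the memory bank, and each query token is scored by how far it lies from its closest reference tokens.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products become cosine similarities."""
    return x / np.maximum(np.linalg.norm(x, axis=-1, keepdims=True), eps)

def token_anomaly_scores(query_tokens, memory_bank, k=1):
    """Score each query token by its distance to the k most similar reference tokens.

    query_tokens: (num_query, dim) features of the test image (hypothetical input).
    memory_bank:  (num_memory, dim) features of the few normal reference images.
    Returns a (num_query,) array; higher values mean more anomalous.
    """
    q = l2_normalize(np.asarray(query_tokens, dtype=np.float64))
    m = l2_normalize(np.asarray(memory_bank, dtype=np.float64))
    sim = q @ m.T                          # cosine similarity, (num_query, num_memory)
    k = min(k, m.shape[0])                 # cannot retrieve more tokens than exist
    topk = np.sort(sim, axis=1)[:, -k:]    # k largest similarities per query token
    return 1.0 - topk.mean(axis=1)

def image_anomaly_score(query_tokens, memory_bank, k=1):
    """Image-level score: the most anomalous token decides (an assumed pooling rule)."""
    return float(token_anomaly_scores(query_tokens, memory_bank, k).max())
```

In this sketch, the token-level scores would correspond to a segmentation map once reshaped to the feature grid, and the image-level score to classification. Noisy or unrepresentative reference tokens inflate or deflate these distances, which is the kind of memory-bank noise the abstract says ICTR is meant to reduce.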