

Poster

Hallucinatory Image Tokens: A Training-free EAZY Approach to Detecting and Mitigating Object Hallucinations in LVLMs

Liwei Che · Qingze Liu · Jing Jia · Weiyi Qin · Ruixiang Tang · Vladimir Pavlovic


Abstract:

Despite their remarkable potential, Large Vision-Language Models (LVLMs) still face challenges with object hallucination, a problem in which their generated outputs mistakenly describe objects that do not actually exist in the image. While most prior work addresses this issue within the language-model backbone, our work shifts the focus to the image input source, investigating how specific image tokens contribute to hallucinations. Our analysis reveals that a small subset of image tokens with high attention scores are the main drivers of object hallucination. By removing these hallucinatory image tokens (only 1.5% of all image tokens), the issue can be effectively mitigated. This finding holds consistently across different models. Building on this insight, we introduce EAZY, a novel, training-free method that automatically identifies and Eliminates hAllucinations by Zeroing out hallucinatorY image tokens. We utilize EAZY for unsupervised object hallucination detection, achieving a 15% improvement over previous methods. Additionally, EAZY demonstrates remarkable effectiveness in mitigating hallucinations while preserving model utility and seamlessly adapting to various LVLM architectures.
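To make the core idea concrete, the sketch below illustrates one plausible way to "zero out" a small fraction of high-attention image tokens before decoding. This is not the authors' actual EAZY algorithm (the abstract does not specify how hallucinatory tokens are identified); the function name, the use of received attention mass as the ranking signal, and the 1.5% fraction taken from the abstract are all assumptions made for illustration.

```python
import torch


def zero_out_high_attention_image_tokens(image_embeds: torch.Tensor,
                                          attn_to_image: torch.Tensor,
                                          fraction: float = 0.015):
    """Hypothetical helper: mask the image-token embeddings that receive the
    most attention from the decoder, on the assumption that these drive
    object hallucination. The real EAZY selection criterion may differ.

    image_embeds:  (num_image_tokens, hidden_dim) visual token embeddings
    attn_to_image: (num_image_tokens,) attention mass each image token receives
    fraction:      share of tokens to remove (~1.5% per the abstract)
    """
    num_tokens = image_embeds.shape[0]
    k = max(1, int(round(fraction * num_tokens)))

    # Rank image tokens by received attention and pick the top-k candidates.
    top_idx = torch.topk(attn_to_image, k).indices

    # "Zero out" the suspected hallucinatory tokens; the masked embeddings
    # would then be fed back to the LVLM for regeneration.
    masked = image_embeds.clone()
    masked[top_idx] = 0.0
    return masked, top_idx


# Toy usage with random tensors (576 image tokens, 4096-dim embeddings).
if __name__ == "__main__":
    embeds = torch.randn(576, 4096)
    attn = torch.rand(576)
    masked_embeds, removed = zero_out_high_attention_image_tokens(embeds, attn)
    print(f"Removed {removed.numel()} of {embeds.shape[0]} image tokens")
```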
