Poster
Mitigating Object Hallucinations via Sentence-Level Early Intervention
Shangpin Peng · Senqiao Yang · Li Jiang · Zhuotao Tian
Multimodal large language models (MLLMs) have revolutionized cross-modal understanding but continue to struggle with hallucinations: fabricated content that contradicts the visual input. Existing hallucination mitigation methods either incur prohibitive computational costs or introduce distribution mismatches between training data and model outputs. We identify a critical insight: hallucinations predominantly emerge at the early stages of text generation and propagate through subsequent outputs. To address this, we propose SENTINEL (Sentence-level Early iNtervention Through IN-domain prEference Learning), a framework that eliminates dependency on human annotations. Specifically, we first bootstrap high-quality in-domain preference pairs by iteratively sampling model outputs, validating object existence through cross-checking with two open-vocabulary detectors, and classifying each sentence as hallucinated or non-hallucinated. Subsequently, we use context-coherent positive samples and hallucinated negative samples to iteratively build context-aware preference data. Finally, we train models with a context-aware preference loss (C-DPO) that emphasizes discriminative learning at the sentence level, where hallucinations initially manifest. Experimental results show that SENTINEL reduces hallucinations by 90% relative to the original model and outperforms previous state-of-the-art methods on both hallucination benchmarks and general-capability benchmarks, demonstrating its effectiveness and generalization ability. The proposed models, datasets, and code will be made publicly available.
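As a minimal sketch of two of these steps, the Python snippet below illustrates (1) labeling a generated sentence by cross-checking its mentioned objects against two open-vocabulary detectors, and (2) a DPO-style preference loss restricted to sentence tokens. The function names, the detector-agreement rule, and the `beta` value are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F


def label_sentence(mentioned_objects: set[str],
                   detections_a: set[str],
                   detections_b: set[str]) -> str:
    """Label a generated sentence by cross-checking its mentioned objects
    against the outputs of two open-vocabulary detectors. Assumption: an
    object counts as grounded if at least one detector confirms it; the
    abstract does not specify the exact agreement rule."""
    for obj in mentioned_objects:
        if obj not in detections_a and obj not in detections_b:
            return "hallucinated"
    return "non-hallucinated"


def sentence_level_dpo_loss(policy_logp_pos: torch.Tensor,
                            policy_logp_neg: torch.Tensor,
                            ref_logp_pos: torch.Tensor,
                            ref_logp_neg: torch.Tensor,
                            beta: float = 0.1) -> torch.Tensor:
    """Standard DPO objective applied at the sentence level: each input is
    the token log-probability summed over the positive (context-coherent)
    or negative (hallucinated) continuation sentence only, so the gradient
    concentrates where hallucinations first appear."""
    pos_margin = policy_logp_pos - ref_logp_pos
    neg_margin = policy_logp_neg - ref_logp_neg
    return -F.logsigmoid(beta * (pos_margin - neg_margin)).mean()
```

This sketch shows only a vanilla sentence-restricted DPO form; the paper's C-DPO additionally conditions on the shared preceding context when contrasting the positive and negative sentences.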