

Poster

Ask and Remember: A Questions-Only Replay Strategy for Continual Visual Question Answering

Imad Eddine MAROUF · Enzo Tartaglione · Stéphane Lathuilière · Joost van de Weijer


Abstract:

Continual Learning in Visual Question Answering (VQACL) requires models to acquire new visual-linguistic skills (plasticity) while preserving previously learned knowledge (stability). The inherent multimodality of VQACL exacerbates this challenge, as models must balance stability across both visual and textual domains while adapting to novel objects and reasoning tasks. Existing methods, primarily designed for unimodal settings, often fall short in addressing this dual requirement. In this work, we present QUestion-only replay with Attention Distillation (QUAD), a novel approach for VQACL that leverages only past task questions for regularization. By eliminating the need to store visual data, QUAD not only reduces memory overhead, but also alleviates privacy concerns. Our method introduces a Question-only Replay mechanism that selectively reuses prior task questions to counteract overfitting to the current task’s answer space, addressing the out-of-answer-set problem. Complementing this, we propose Attention Consistency Distillation to enforce both intra-modal and inter-modal attention consistency across tasks, preserving essential visual-linguistic associations. Extensive experiments on VQAv2 and NExT-QA demonstrate that QUAD significantly outperforms state-of-the-art methods, achieving robust performance in continual VQA. The source code, provided in the supplementary material, will be publicly released upon acceptance.
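The abstract's second mechanism, Attention Consistency Distillation, penalizes drift between the attention maps of the frozen previous-task model and the current model. A minimal sketch of such a consistency loss is below, using a KL divergence between old and new attention distributions; the function names (`attention_consistency_loss`, `softmax`) and the choice of KL over, say, an L2 distance are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax over the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_consistency_loss(attn_old, attn_new, eps=1e-8):
    """KL(attn_old || attn_new), averaged over all query positions.

    attn_old: attention maps from the frozen previous-task model
              (rows are probability distributions over keys).
    attn_new: attention maps from the current model, same shape.
    A hypothetical stand-in for the paper's distillation term.
    """
    p = attn_old + eps
    q = attn_new + eps
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean())
```

During training on a new task, this term would be added to the VQA objective for replayed questions, discouraging the current model's visual-linguistic attention patterns from diverging from those learned on earlier tasks.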
