ICCV Poster ESSENTIAL: Episodic and Semantic Memory Integration for Video Class-Incremental Learning

Poster

Jongseo Lee · Kyungho Bae · Kyle Min · Gyeong-Moon Park · Jinwoo Choi

Exhibit Hall I #1620

Highlight


Wed 22 Oct 5:45 p.m. PDT — 7:45 p.m. PDT

Abstract:

In this work, we tackle the problem of video class-incremental learning (VCIL). Many existing VCIL methods mitigate catastrophic forgetting by rehearsal training with a few temporally dense samples stored in episodic memory, which is memory-inefficient. Alternatively, some methods store temporally sparse samples, sacrificing essential temporal information and thereby resulting in inferior performance. To address this trade-off between memory-efficiency and performance, we propose EpiSodic and SEmaNTIc memory integrAtion for video class-incremental Learning (ESSENTIAL). We are inspired by the human memory system, which integrates episodic and semantic memory for accurate information retrieval. ESSENTIAL consists of episodic memory for storing temporally sparse features and semantic memory for storing general knowledge represented by learnable prompts. We introduce a novel memory retrieval (MR) module that integrates episodic and semantic memory through cross-attention, enabling the retrieval of temporally dense features from temporally sparse features. We rigorously validate ESSENTIAL on diverse datasets: UCF-101, HMDB51, and Something-Something-V2 from the TCD benchmark and UCF-101, ActivityNet, and Kinetics-400 from the vCLIMB benchmark. Remarkably, with significantly reduced memory, ESSENTIAL achieves favorable performance on the benchmarks.
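
To make the retrieval idea concrete, the following is a minimal sketch of single-head cross-attention between a few stored sparse features and a set of prompt vectors. It is not the authors' MR module. The abstract does not say which memory supplies the queries, so the sketch assumes the prompts act as queries (one per dense time step) and the sparse features act as keys and values. The names `cross_attention` and `retrieve_dense`, the sizes, and the random stand-ins for learned prompts are all hypothetical, and learned projection layers are left out.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product cross-attention, no learned projections."""
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim_out = len(values[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key.
        scores = [scale * sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim_out)]
        )
    return outputs


def retrieve_dense(sparse_feats, prompts):
    """Hypothetical retrieval step: prompts query the sparse episodic features.

    Assumption: one prompt per dense time step, so the output has as many
    rows as there are prompts.
    """
    return cross_attention(prompts, sparse_feats, sparse_feats)


random.seed(0)
T_SPARSE, T_DENSE, DIM = 2, 8, 4  # illustrative sizes, not from the paper

# Stand-ins: stored sparse features and (normally learned) prompt vectors.
sparse_feats = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(T_SPARSE)]
prompts = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(T_DENSE)]

dense_feats = retrieve_dense(sparse_feats, prompts)
print(len(dense_feats), len(dense_feats[0]))  # 8 4
```

In this simplified form each output row is only a weighted average of the stored sparse features, so it cannot add information on its own. A trained module would presumably rely on learned projections and prompts to recover temporal detail, which is why the sketch should be read as an illustration of the data flow only.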
