Poster

$G^{2}D$: Boosting Multimodal Learning with Gradient-Guided Distillation

Mohammed Rakib · Arunkumar Bagavathi


Abstract: Multimodal learning aims to leverage information from diverse data modalities to achieve more comprehensive performance than any single modality alone. However, conventional multimodal models often suffer from modality imbalance, where one or a few modalities dominate optimization, leading to suboptimal feature representations and underutilization of weak modalities. To address this challenge, we introduce Gradient-Guided Distillation ($G^{2}D$), a knowledge distillation framework that optimizes the multimodal model with a custom-built loss function fusing both unimodal and multimodal objectives. $G^{2}D$ further incorporates a dynamic sequential modality prioritization (SMP) technique that ensures each modality takes its turn leading the learning process, avoiding the pitfall of stronger modalities overshadowing weaker ones. We validate $G^{2}D$ on multiple real-world datasets and show that it amplifies the contribution of weak modalities during training and outperforms state-of-the-art methods on classification and regression tasks. The source code is available with the supplementary materials.
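The abstract describes combining unimodal and multimodal objectives while letting one modality lead at a time. Below is a minimal, hypothetical sketch of that idea in PyTorch; the function name `g2d_style_loss`, the per-modality weights, and the per-epoch cycling of the lead modality are illustrative assumptions, not the authors' actual gradient-guided distillation or SMP formulation.

```python
import torch
import torch.nn.functional as F


def g2d_style_loss(unimodal_logits, fused_logits, targets,
                   lead_modality, lead_weight=1.0, other_weight=0.5):
    """Combine a multimodal objective with weighted unimodal objectives.

    unimodal_logits: dict mapping modality name -> (batch, num_classes) logits
    fused_logits:    (batch, num_classes) logits from the fused multimodal head
    lead_modality:   modality whose unimodal loss is emphasized this step
    """
    # Multimodal (fused) objective.
    loss = F.cross_entropy(fused_logits, targets)
    # Unimodal objectives, with the prioritized modality weighted more heavily.
    for name, logits in unimodal_logits.items():
        w = lead_weight if name == lead_modality else other_weight
        loss = loss + w * F.cross_entropy(logits, targets)
    return loss


# Toy usage: cycle which modality leads each epoch, a crude stand-in
# for the paper's dynamic sequential modality prioritization.
modalities = ["audio", "video"]
for epoch in range(4):
    lead = modalities[epoch % len(modalities)]
    unimodal_logits = {m: torch.randn(8, 10) for m in modalities}
    fused_logits = torch.randn(8, 10)
    targets = torch.randint(0, 10, (8,))
    loss = g2d_style_loss(unimodal_logits, fused_logits, targets, lead)
```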
