

Poster

Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced In-domain Knowledge Transferring

Zhu Xu · Ting Lei · Zhimin Li · Guan Wang · Qingchao Chen · Yuxin Peng · Yang Liu


Abstract:

Dynamic Scene Graph Generation (DSGG) aims to create a scene graph for each video frame by detecting objects and predicting their relationships. Weakly Supervised DSGG (WS-DSGG) reduces annotation workload by using an unlocalized scene graph from a single frame per video for training. Existing WS-DSGG methods depend on an off-the-shelf external object detector to generate pseudo labels for subsequent DSGG training. However, detectors trained on static, object-centric images struggle in the dynamic, relation-aware scenarios that DSGG requires, leading to inaccurate localization and low-confidence proposals. To address the challenges posed by external object detectors in WS-DSGG, we propose a Temporal-enhanced In-domain Knowledge Transferring (TIKT) method, which leverages in-domain knowledge to enhance detection in relation-aware dynamic scenarios. TIKT is built on two key components: (1) In-domain knowledge mining: we first employ object and relation class decoders that generate category-specific attention maps highlighting both object regions and interactive areas, which makes the attention maps relation-aware. We then propose an Inter-frame Attention Augmentation strategy that exploits neighboring frames and optical flow information to enhance these attention maps, making them motion-aware and robust to motion blur. Together, these steps yield relation- and motion-aware in-domain knowledge for WS-DSGG. (2) In-domain knowledge transferring: we introduce a Dual-stream Fusion Module that integrates the category-specific attention maps into external detections to refine object localization and boost confidence scores for object proposals. Extensive experiments demonstrate that TIKT significantly improves detection performance, providing more accurate and confident pseudo labels for WS-DSGG training.
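
The idea of using a category-specific attention map to adjust the confidence of an external detection can be illustrated with a toy sketch. This is not the Dual-stream Fusion Module, whose internals the abstract does not specify; it is a hand-written simplification that blends a detector score with the mean attention inside the box. The names `mean_attention_in_box`, `rescore`, and the blending weight `alpha` are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# (x0, y0, x1, y1) in pixel coordinates; x1 and y1 are exclusive.
Box = Tuple[int, int, int, int]


def mean_attention_in_box(attn: List[List[float]], box: Box) -> float:
    """Average attention value over the pixels covered by the box."""
    x0, y0, x1, y1 = box
    values = [attn[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    return sum(values) / len(values) if values else 0.0


def rescore(det_score: float, attn: List[List[float]], box: Box,
            alpha: float = 0.5) -> float:
    """Blend detector confidence with in-box attention.

    Hypothetical rule for illustration only: a fixed linear blend,
    not the fusion described in the abstract.
    """
    return (1.0 - alpha) * det_score + alpha * mean_attention_in_box(attn, box)


# Toy 4x4 attention map for one object category: the object sits in the center.
attn_map = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]

# Two low-confidence proposals from an assumed external detector.
well_aligned = rescore(0.3, attn_map, (1, 1, 3, 3))  # box covers the object
misaligned = rescore(0.3, attn_map, (0, 0, 2, 2))    # box mostly background
```

In this toy setting the proposal that overlaps the high-attention region ends up with a higher score than the misaligned one, which mirrors the stated goal of boosting confidence for well-localized proposals.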
