Poster
Borrowing Eyes for the Blind Spot: Overcoming Data Scarcity in Malicious Video Detection via Cross-Domain Retrieval Augmentation
Rongpei Hong · Jian Lang · Ting Zhong · Fan Zhou
The rapid proliferation of online video-sharing platforms has accelerated the spread of malicious videos, creating an urgent need for robust detection methods. However, the performance and generalizability of existing detection approaches are severely limited by the scarcity of annotated video data, as manually curating large-scale malicious video detection datasets is both labor-intensive and impractical. To address this challenge, we propose CRAVE, a novel CRoss-domAin retrieVal augmEntation framework that transfers knowledge from the resource-rich image-text domain to enhance malicious video detection. Specifically, CRAVE introduces a Pseudo-Pair Retriever to identify semantically relevant image-text data for high-quality cross-domain augmentation. Additionally, a Contrastive Cross-Domain Augmenter is designed to disentangle domain-shared and domain-unique representations, bridging the domain gap during knowledge transfer. The shared image-text representations are then used to refine video representations, yielding more discriminative features for accurate malicious content detection. Experiments on four video datasets demonstrate that CRAVE substantially outperforms competitive baselines in both performance and generalization, providing an innovative and strong solution to the problem of video data scarcity.
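The general idea of retrieval augmentation described above can be illustrated with a minimal sketch. It is not CRAVE's Pseudo-Pair Retriever or Contrastive Cross-Domain Augmenter, whose details the abstract does not give. It assumes that videos and image-text pairs are already embedded in a common vector space, retrieves the nearest image-text embeddings by cosine similarity, and blends their mean into the video embedding. The function names, the toy vectors, and the mixing weight `alpha` are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def retrieve_top_k(video_emb, bank, k=2):
    """Return up to k (index, similarity) pairs from the bank, most similar first."""
    scored = [(i, cosine(video_emb, emb)) for i, emb in enumerate(bank)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def augment(video_emb, bank, k=2, alpha=0.5):
    """Blend the video embedding with the mean of its k retrieved neighbours.

    alpha is a hypothetical mixing weight: 0 keeps the video embedding
    unchanged, 1 replaces it with the retrieved mean.
    """
    top = retrieve_top_k(video_emb, bank, k)
    if not top:
        # Nothing to retrieve from: return the embedding unchanged.
        return list(video_emb)
    dim = len(video_emb)
    mean = [sum(bank[i][d] for i, _ in top) / len(top) for d in range(dim)]
    return [(1 - alpha) * v + alpha * m for v, m in zip(video_emb, mean)]


# Toy data (illustrative only): one video embedding and a small image-text bank.
video = [1.0, 0.0, 0.0]
bank = [
    [0.9, 0.1, 0.0],   # close to the video
    [0.0, 1.0, 0.0],   # orthogonal
    [0.8, 0.0, 0.2],   # close to the video
    [-1.0, 0.0, 0.0],  # opposite direction
]

top = retrieve_top_k(video, bank, k=2)
refined = augment(video, bank, k=2, alpha=0.5)
print([i for i, _ in top])
print(refined)
```

In this toy setup the two bank entries pointing in nearly the same direction as the video are retrieved, and the refined embedding moves halfway toward their mean. A learned retriever and a disentangling step such as the one the abstract describes would replace the plain cosine ranking and the fixed interpolation used here.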