Poster
Backdoor Defense via Enhanced Splitting and Trap Isolation
Hongrui Yu · Lu Qi · Wanyu Lin · Jian Chen · Hailong Sun · Chengbin Sun
Backdoor attacks pose a significant threat to deep neural networks (DNNs), as attackers can inject a backdoor by tampering with only a few samples. The variety of backdoor attacks makes comprehensive defense extremely challenging. Previous defenses typically assume that backdoor samples are out-of-distribution (OOD) with respect to benign samples. However, backdoor samples can also be in-distribution (ID), making them hard to identify as outliers and causing such defenses to fail. To address this issue, we propose a two-stage backdoor defense based on Enhanced Splitting and Trap Isolation (ESTI), which leverages the attacker's own tampering to defend against the attack. In the first stage, we use backdoored models in conjunction with a benign model to split the dataset into a reliable clean subset and a poisoned subset. In the second stage, we introduce a trap mechanism that isolates the poisoned subset into a dedicated trap class and trains a trap-model on the relabeled data. The trap-model flips the predictions of poisoned samples from the attacker's target class to the trap class. Extensive experiments on three benchmark datasets and five model architectures show that ESTI effectively defends against a variety of backdoor attacks while maintaining model performance on benign data, demonstrating the superiority of our approach. Our code is available in the supplementary material.
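The sketch below illustrates one possible reading of the second stage described in the abstract: samples flagged as poisoned in the first stage are relabeled to an extra trap class, and a classifier with one additional output is trained on the union of the clean subset and the relabeled poisoned subset. This is a minimal sketch under stated assumptions, not the authors' implementation; the names (TrapRelabeled, train_trap_model), the choice of PyTorch, and the hyperparameters are all illustrative.

```python
# Hypothetical sketch of the trap-isolation stage (stage 2) of ESTI.
# Assumes stage 1 has already split the training data into `clean_set`
# and `poison_set`; all names and hyperparameters are illustrative only.
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader, ConcatDataset


class TrapRelabeled(Dataset):
    """Wraps the suspected-poisoned subset and relabels every sample to the trap class."""

    def __init__(self, poison_subset, trap_class: int):
        self.subset = poison_subset
        self.trap_class = trap_class

    def __len__(self):
        return len(self.subset)

    def __getitem__(self, idx):
        x, _ = self.subset[idx]        # discard the attacker-chosen label
        return x, self.trap_class      # assign the extra trap class instead


def train_trap_model(model: nn.Module, clean_set, poison_set,
                     num_classes: int, epochs: int = 10, device: str = "cpu"):
    """Train a classifier with num_classes + 1 outputs; the extra output is the trap class."""
    trap_class = num_classes                       # index of the added output
    trap_set = TrapRelabeled(poison_set, trap_class)
    loader = DataLoader(ConcatDataset([clean_set, trap_set]),
                        batch_size=128, shuffle=True)

    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

    model.to(device).train()
    for _ in range(epochs):
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            optimizer.zero_grad()
            loss = criterion(model(x), y)          # model must expose num_classes + 1 logits
            loss.backward()
            optimizer.step()
    return model
```

Under this reading, a prediction of the trap class at inference time signals a triggered input: samples that would otherwise be driven toward the attacker's target class are instead absorbed by the trap class, while predictions on benign inputs remain among the original classes.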