Poster
Backdoor Attacks on Neural Networks via One-Bit Flip
Xiang Li · Lannan Luo · Qiang Zeng
Conventional backdoor attacks on deep neural networks (DNNs) typically assume that an attacker can manipulate the training data or process. However, recent research introduces a more practical threat model by injecting backdoors at the inference stage. These approaches leverage bit-flip attacks to modify model weights using memory fault injection techniques such as Rowhammer. Despite their effectiveness, they suffer from a significant limitation: the need to flip a relatively large number of bits simultaneously, which is highly challenging in practice. To overcome this constraint, we propose SOLEFLIP, the first one-bit-flip backdoor attack. Unlike prior methods, which rely on optimization-based bit searches and require flipping multiple bits, our algorithm identifies a promising weight for the attack and flips a single bit to insert a backdoor. We evaluate SOLEFLIP on CIFAR-10, SVHN, and ImageNet across various DNN architectures, including a vision transformer. The results show that SOLEFLIP achieves high attack success rates (up to 99.9%, with an average of 98.9%) while causing minimal degradation to benign accuracy (0.0% on average). Furthermore, SOLEFLIP is resilient to backdoor defenses. Our findings reveal a critical threat to DNNs: flipping just one bit is sufficient to execute a successful backdoor attack.
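To illustrate why a single bit flip can matter so much, the minimal sketch below shows how flipping one exponent bit of an IEEE-754 float32 weight changes its magnitude by dozens of orders of magnitude. This is the general mechanism that memory-fault-injection attacks exploit; it is not the SOLEFLIP algorithm itself, and the `flip_bit` helper is a hypothetical name introduced only for this illustration.

```python
import struct

def flip_bit(value: float, bit_index: int) -> float:
    """Flip one bit of a float32 value.

    Bit layout: 31 = sign, 30-23 = exponent, 22-0 = mantissa.
    (Hypothetical helper for illustration; not from the paper.)
    """
    # Reinterpret the float32 as a 32-bit unsigned integer.
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    # XOR toggles exactly the requested bit.
    bits ^= 1 << bit_index
    # Reinterpret the modified bits as a float32 again.
    (flipped,) = struct.unpack("<f", struct.pack("<I", bits))
    return flipped

# A typical small DNN weight becomes enormous when the most
# significant exponent bit (bit 30) flips: 0.05 -> ~1.7e+37.
w = 0.05
print(w, "->", flip_bit(w, 30))
```

The takeaway is that the float32 encoding makes some bits far more consequential than others, which is why choosing the right weight and the right bit, rather than searching for many bits to flip at once, can be enough to alter a model's behavior.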