Poster
Allowing Oscillation Quantization: Overcoming Solution Space Limitation in Low Bit-Width Quantization
Weiying Xie · Zihan Meng · Jitao Ma · Wenjin Guo · Haowei Li · Haonan Qin · Leyuan Fang · Yunsong Li
Abstract:
Quantization-aware Training (QAT) helps deep models adapt to precision loss by simulating quantization operations during training. However, existing methods fail to reach the optimal solution because they explore the quantization solution space inadequately. To address this issue, we propose a novel QAT method, Allowing Oscillation Quantization (AOQ), which expands the reachable solution space through weight oscillation. Notably, unlike previous methods that suppress oscillation throughout training, AOQ promotes oscillation in the early and middle training stages to explore a broader range of quantized configurations, and suppresses it in the later stage to ensure stable convergence. Furthermore, by decoupling the quantization thresholds and levels, we encourage meaningful oscillation and improve the stability of learnable quantization parameters. Extensive experiments on various models, including ResNet, MobileNet, DeiT, and Swin Transformer, demonstrate the effectiveness of our method. Specifically, with 2-bit quantization, AOQ achieves a performance improvement of $0.4\%\sim2.2\%$ on ImageNet compared to state-of-the-art methods.
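The abstract does not include the full algorithm, but the staged schedule it describes can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the function names, the uniform symmetric fake-quantizer, and the two-thirds cutoff between the promote and suppress phases are not from the paper.

```python
def fake_quantize(w, scale, n_bits=2):
    # Uniform symmetric fake quantization (an assumed baseline quantizer):
    # snap a weight to the nearest level, clamped to the n-bit integer range.
    qmax = 2 ** (n_bits - 1) - 1
    qmin = -qmax - 1
    q = max(qmin, min(qmax, round(w / scale)))
    return q * scale

def oscillation_weight(step, total_steps, promote=1.0, suppress=-1.0):
    # Hypothetical two-phase schedule in the spirit of AOQ: a positive
    # coefficient (encouraging weights to oscillate across a quantization
    # threshold and explore neighboring quantized configurations) for
    # roughly the first two thirds of training, then a negative one to
    # damp oscillation so weights settle and converge stably.
    return promote if step < 2 * total_steps // 3 else suppress
```

A training loop would multiply `oscillation_weight(step, total_steps)` into an oscillation-related regularization term, so the same term that rewards threshold crossings early in training penalizes them late; the exact form of that term and the phase boundary are design choices the abstract leaves unspecified.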