Poster
WINS: Winograd Structured Pruning for Fast Winograd Convolution
Cheonjun Park · Hyunjae Oh · Mincheol Park · Hyunchan Moon · Minsik Kim · Suhyun Kim · Myung Kuk Yoon · Won Woo Ro
Recent GPUs leverage Winograd convolution and structured pruning to significantly accelerate inference. First, Winograd convolution is theoretically 2.25× faster than standard convolution. Second, structured pruning reduces inference time without additional overhead as the pruning ratio increases. However, applying conventional structured pruning alongside Winograd convolution is inefficient. Existing structured pruning methods, which do not account for how GPUs process Winograd convolution, require large pruning unit sizes, leading to significant information loss. In this paper, we propose Winograd Structured Pruning (WINS), the first approach to employ optimized structured pruning for Winograd convolution. WINS is designed based on an in-depth analysis of Winograd convolution's computational characteristics on GPUs. Additionally, we introduce two variants, WINS-B and WINS-AB, which further enhance performance. Experimental results show that WINS-AB achieves up to 2.8× practical speedup in Winograd convolution inference on GPUs while preserving the accuracy of ResNet-18 on ImageNet.
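As an illustration of where a theoretical 2.25× figure can come from, the sketch below assumes the commonly used F(2×2, 3×3) Winograd tile, which the abstract does not name explicitly. In that configuration a 2×2 output tile needs 16 element-wise multiplications instead of the 36 used by direct convolution. The function names and the sample tile are hypothetical and are not taken from the paper; the code does not implement WINS or any pruning step.

```python
from fractions import Fraction

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

# Standard F(2x2, 3x3) transform matrices (assumed tile configuration).
half = Fraction(1, 2)
BT = [[1, 0, -1, 0], [0, 1, 1, 0], [0, -1, 1, 0], [0, 1, 0, -1]]  # input transform
G = [[1, 0, 0], [half, half, half], [half, -half, half], [0, 0, 1]]  # filter transform
AT = [[1, 1, 1, 0], [0, 1, -1, -1]]  # output transform

def winograd_2x2_3x3(d, g):
    """2x2 output tile from a 4x4 input tile d and a 3x3 filter g."""
    U = matmul(matmul(G, g), transpose(G))    # 4x4 transformed filter
    V = matmul(matmul(BT, d), transpose(BT))  # 4x4 transformed input
    # The only data-dependent multiplications: 16 element-wise products.
    M = [[U[i][j] * V[i][j] for j in range(4)] for i in range(4)]
    return matmul(matmul(AT, M), transpose(AT))

def direct_2x2_3x3(d, g):
    """Reference sliding-window result: 9 multiplications per output."""
    return [[sum(d[i + u][j + v] * g[u][v] for u in range(3) for v in range(3))
             for j in range(2)] for i in range(2)]

DIRECT_MULTS = 2 * 2 * 3 * 3   # 36 multiplications for a 2x2 output tile
WINOGRAD_MULTS = 4 * 4         # 16 element-wise multiplications
THEORETICAL_SPEEDUP = Fraction(DIRECT_MULTS, WINOGRAD_MULTS)  # 9/4 = 2.25

# Hypothetical sample data for a quick equivalence check.
tile = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
assert winograd_2x2_3x3(tile, kernel) == direct_2x2_3x3(tile, kernel)
```

The count ignores the additions in the input, filter, and output transforms, which is one reason a theoretical ratio of this kind need not match measured GPU speedup.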