Heatmap Regression without Soft-Argmax for Facial Landmark Detection
Chiao-An Yang ⋅ Raymond A. Yeh
2025 Poster
Abstract
Facial landmark detection is an important task in computer vision with numerous downstream applications, such as head pose estimation, expression analysis, and face swapping. Heatmap regression-based methods have been a strong contender in achieving state-of-the-art results in this task. These methods compute the argmax over a heatmap to predict each landmark. Because argmax is not differentiable, these methods rely on a differentiable approximation of it, namely Soft-argmax, to enable end-to-end training of deep networks. In this work, we revisit this long-standing choice of using Soft-argmax and find that it may not be necessary. Instead, we propose an alternative training objective based on the classic structured prediction framework. Empirically, our method achieves state-of-the-art performance on three facial landmark benchmarks (WFLW, COFW, and 300W) and converges roughly $2.2\times$ faster in training, while maintaining intuitive design choices in our model.
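For readers unfamiliar with the baseline the abstract refers to, the following is a minimal NumPy sketch of argmax versus Soft-argmax decoding on a single heatmap. It illustrates only the conventional Soft-argmax approach, not the structured prediction objective proposed in this work. The function names and the temperature parameter `beta` are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np


def hard_argmax_2d(heatmap):
    """Return the (x, y) pixel location of the highest heatmap score."""
    row, col = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return float(col), float(row)


def soft_argmax_2d(heatmap, beta=1.0):
    """Return the expected (x, y) location under softmax(beta * heatmap).

    `beta` is a hypothetical temperature: larger values push the result
    toward the hard argmax, smaller values toward the heatmap center.
    """
    h, w = heatmap.shape
    logits = beta * heatmap.reshape(-1)
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    probs = (probs / probs.sum()).reshape(h, w)
    # Marginalize over rows/columns, then take the expected coordinate.
    x = float((probs.sum(axis=0) * np.arange(w)).sum())
    y = float((probs.sum(axis=1) * np.arange(h)).sum())
    return x, y


# Toy 5x5 heatmap with a single sharp peak at x=3, y=1.
heatmap = np.zeros((5, 5))
heatmap[1, 3] = 10.0

hard_xy = hard_argmax_2d(heatmap)
soft_xy = soft_argmax_2d(heatmap, beta=1.0)
flat_xy = soft_argmax_2d(heatmap, beta=0.0)  # uniform weights
```

On this toy input the Soft-argmax estimate lies close to the peak but is not exactly on it, because every pixel contributes a small weight to the expectation. With `beta=0.0` the weights are uniform and the estimate falls at the heatmap center, regardless of where the peak is.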