Poster
ForCenNet: Foreground-Centric Network for Document Image Rectification
Peng Cai · liqiang liqiang · Kaicheng Yang · guodong guodong · lijia lijia · zhounan zhounan · Xiang An · Ninghua Yang · Jiankang Deng
Document image rectification aims to eliminate geometric deformation in photographed documents to facilitate text recognition. However, existing methods often neglect the significance of foreground elements, which provide essential geometric references and layout information for document image correction. In this paper, we introduce the Foreground-Centric Network (ForCenNet) to eliminate geometric distortions in document images. Specifically, we first propose a foreground-centric label generation method, which extracts detailed foreground elements from an undistorted image. We then introduce a foreground-centric mask mechanism to enhance the distinction between readable and background regions. Furthermore, we design a curvature consistency loss that leverages the detailed foreground labels to help the model understand the distorted geometric distribution. Extensive experiments demonstrate that ForCenNet achieves new state-of-the-art results on four real-world benchmarks: DocUNet, DIR300, WarpDoc, and DocReal. Quantitative analysis shows that the proposed method effectively undistorts layout elements, such as text lines and table borders. Our training code and pre-trained models will be released to facilitate future research.
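The abstract does not give the formula for the curvature consistency loss, so the following is only a rough sketch of what such a term could look like, not the authors' formulation. It assumes foreground lines (for example text lines or table borders) are available as sampled polylines, uses a second-difference curvature proxy, and penalizes the mean absolute curvature gap between predicted and reference lines. The function names `discrete_curvature` and `curvature_consistency_loss` are hypothetical.

```python
import numpy as np

def discrete_curvature(points):
    """Curvature proxy for a polyline of shape (N, 2), N >= 3.

    Assumption for illustration: the norm of the second difference
    p[i-1] - 2*p[i] + p[i+1], which is zero for evenly spaced
    collinear points and grows as the line bends.
    """
    pts = np.asarray(points, dtype=float)
    second_diff = pts[:-2] - 2.0 * pts[1:-1] + pts[2:]
    return np.linalg.norm(second_diff, axis=1)

def curvature_consistency_loss(pred_lines, target_lines):
    """Mean absolute curvature difference over paired polylines.

    Each pair must contain the same number of sampled points.
    """
    per_line = [
        np.mean(np.abs(discrete_curvature(p) - discrete_curvature(t)))
        for p, t in zip(pred_lines, target_lines)
    ]
    return float(np.mean(per_line))

# A straight reference line versus a parabolic (still warped) prediction.
xs = np.arange(5.0)
straight = np.stack([xs, np.zeros_like(xs)], axis=1)
bent = np.stack([xs, xs ** 2], axis=1)
loss = curvature_consistency_loss([bent], [straight])
```

In this toy case the straight line has zero curvature everywhere, so the loss equals the mean curvature proxy of the bent line and drops to zero once the predicted line is as straight as the reference.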