Poster
Multi-scenario Overlapping Text Segmentation with Depth Awareness
Yang Liu · Xudong Xie · Yuliang Liu · Xiang Bai
Overlapping text poses significant challenges for text-related perception tasks, particularly in open scenes with diverse fonts and visual effects. Existing research has primarily addressed overlapping text in documents, and the applicability of these methods to other scenes remains limited. To bridge this gap, we propose a new task, multi-scenario overlapping text segmentation, and introduce a corresponding real dataset in English and Chinese that spans contexts such as printed text, bills, artistic designs, and house numbers. To further improve the generalization of overlapping text segmentation models, we propose a hierarchical training data synthesis strategy that simulates diverse overlapping patterns across scenarios. We also find that depth maps provide clear relative position relationships in three-dimensional space, which helps the model capture complex overlapping relationships between text instances. Building on this insight, we present a depth-guided decoder that integrates image and depth features to capture overlapping interactions. Compared with existing state-of-the-art methods, our model improves text mIoU by 5.3% on our benchmark and overall mIoU by 6.4% on the SignaTR6k dataset.
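The reported gains are measured in mean intersection over union (mIoU). The sketch below shows how such a metric is commonly computed for a per-pixel label map; it is not the authors' evaluation code. The class layout (0 = background, 1 = lower text layer, 2 = upper text layer), the function name `mean_iou`, and the reading of "text mIoU" as the mean over text classes only and "overall mIoU" as the mean over all classes are assumptions made for illustration.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int],
             class_ids: Sequence[int]) -> float:
    """Mean IoU over the given classes for flattened per-pixel labels.

    Classes that appear in neither `pred` nor `target` are skipped so that
    an absent class does not count as a zero score.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    ious = []
    for c in class_ids:
        # Pixels labelled c in both maps (intersection) or in either map (union).
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical label layout: 0 = background, 1 = lower text, 2 = upper text.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]

overall_miou = mean_iou(pred, target, class_ids=[0, 1, 2])  # all classes
text_miou = mean_iou(pred, target, class_ids=[1, 2])        # text classes only
print(f"overall mIoU = {overall_miou:.4f}, text mIoU = {text_miou:.4f}")
```

Passing the class list explicitly keeps the two variants of the metric in one function; a real evaluation would typically accumulate intersections and unions over the whole dataset before dividing, rather than averaging per image.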