A Multi-Oriented Scene Text Detector with Position-Sensitive Segmentation
Scene text detection has been studied for a long time, and many approaches have achieved promising performance. Most regard text as a specific class of object and adopt popular object detection frameworks to detect it. However, scene text differs from general objects in orientation, size, and aspect ratio. In this paper, we present an end-to-end multi-oriented scene text detection approach that combines an object detection framework with position-sensitive segmentation. For a given image, features are extracted by a fully convolutional network and fed simultaneously into a text detection branch, which generates candidate boxes, and a position-sensitive segmentation branch, which generates segmentation maps. Finally, the candidates generated by the detection branch are projected onto the position-sensitive segmentation maps for filtering. The position-sensitive segmentation improves the expressiveness of the network, and using its maps to filter candidates substantially improves the precision rate. Experiments on the ICDAR2015 and COCO-Text datasets demonstrate that the proposed method outperforms previous state-of-the-art methods. On ICDAR2015, the method achieves an F-score of 0.83 and a precision rate of 0.87.
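The filtering step described above can be sketched as follows. This is an illustrative simplification, not the paper's implementation: it assumes axis-aligned candidate boxes, a k×k grid of position-sensitive maps (R-FCN-style, where map i responds only to the i-th relative position inside a box), and an arbitrary score threshold. The function names and the threshold value are my own.

```python
import numpy as np

def position_sensitive_score(maps, box, k=2):
    """Score an axis-aligned candidate box against k*k position-sensitive
    segmentation maps of shape (k*k, H, W). Illustrative sketch only:
    the paper's maps are learned by the segmentation branch, and its
    candidates may be rotated quadrilaterals."""
    x0, y0, x1, y1 = box
    cell_scores = []
    for gy in range(k):
        for gx in range(k):
            # Cell (gy, gx) of the box is pooled from map index gy*k+gx,
            # so each map contributes only at its own relative position.
            cx0 = x0 + (x1 - x0) * gx / k
            cx1 = x0 + (x1 - x0) * (gx + 1) / k
            cy0 = y0 + (y1 - y0) * gy / k
            cy1 = y0 + (y1 - y0) * (gy + 1) / k
            region = maps[gy * k + gx,
                          int(cy0):max(int(cy1), int(cy0) + 1),
                          int(cx0):max(int(cx1), int(cx0) + 1)]
            cell_scores.append(region.mean())
    return float(np.mean(cell_scores))

def filter_candidates(maps, boxes, k=2, threshold=0.5):
    """Keep candidates whose averaged position-sensitive score clears a
    threshold (the threshold value here is an assumption)."""
    return [b for b in boxes
            if position_sensitive_score(maps, b, k) >= threshold]
```

Because each grid cell of a candidate is scored by a map specialized to that relative position, a box that only partially covers a text region scores poorly in the mismatched cells, which is what lets the projection step suppress false positives.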
Code
Not found
Tasks
scene text detection
Datasets
ICDAR2015, COCO-Text
Problems
improving scene text detection accuracy by addressing orientation, size, and aspect ratio differences
Methods
end-to-end multi-oriented scene text detection, object detection framework, position-sensitive segmentation, fully convolutional network
Results from the Paper
F-score of 0.83 and precision rate of 0.87 on ICDAR2015, outperforms state-of-the-art methods