A Robust Symmetry-Based Method for Scene/Video Text Detection through Neural Network
Wu, Yirui; Wang, Wenhai; Palaiahnakote, Shivakumara; Lu, Tong
Authors
Yirui Wu
Wenhai Wang
Dr Shivakumara Palaiahnakote S.Palaiahnakote@salford.ac.uk
Lecturer in Computer Vision
Tong Lu
Abstract
Text detection in video/scene images has gained significant attention in the field of image processing and document analysis because of the inherent challenges caused by variations in contrast, orientation, background, text type, font type, non-uniform illumination and so on. In this paper, we propose a novel text detection method that explores the symmetry property and appearance features of text for improved accuracy and robustness. First, the proposed method uses Extremal Regions (ER) to detect text candidates in images. Then we propose a novel feature, named Multi-domain Strokes Symmetry Histogram (MSSH), for each text candidate, which describes the inherent symmetry property of stroke pixel pairs in the gray, gradient and frequency domains. Furthermore, deep convolutional features are extracted to describe the appearance of each text candidate. We then fuse the MSSH and deep convolutional features with an Auto-Encoder network to define a more discriminative text descriptor for classification. Finally, the proposed method constructs text lines based on the classification results. We demonstrate the effectiveness and robustness of the proposed method by testing on four different benchmark databases.
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) |
Start Date | Nov 9, 2017 |
End Date | Nov 15, 2017 |
Online Publication Date | Jan 29, 2018 |
Publication Date | Jan 29, 2018 |
Deposit Date | Nov 15, 2024 |
Publisher | Institute of Electrical and Electronics Engineers |
Series ISSN | 2379-2140 |
Book Title | 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR) |
ISBN | 9781538635872 |
DOI | https://doi.org/10.1109/ICDAR.2017.206 |