Tanmay Jain
Deformable scene text detection using harmonic features and modified pixel aggregation network
Jain, Tanmay; Palaiahnakote, Shivakumara; Pal, Umapada; Liu, Cheng-Lin
Authors
Dr Shivakumara Palaiahnakote S.Palaiahnakote@salford.ac.uk
Lecturer in Computer Vision
Umapada Pal
Cheng-Lin Liu
Abstract
Although text detection methods have addressed several challenges in the past, there is a dearth of effective methods for text detection in deformable images, such as images containing text embedded on cloth, banners, rubber, sports jerseys, uniforms, etc. This is because deformable regions contain surfaces of arbitrary shape, which leads to poor text quality. This paper presents a new method for deformable text detection in natural scene images. It is observed that although the shapes of characters change in a deformable region, the pixel values and the spatial relationships between pixels do not change. This motivated us to explore the extraction of Maximally Stable Extremal Regions (MSER), in which pixels that share common features are grouped into components. The unique character shape variations led us to explore harmonic features to represent the component shape variations; a classifier then uses these features to separate text from non-text components in the output of the MSER step. Additionally, the objective of developing a lightweight method with low computational cost motivated us to introduce a modified Pixel Aggregation Network (PAN) for deformable text detection at the component level. Comprehensive experiments on our Deformable Text Dataset (DTD) and on standard natural scene text datasets, namely MSRA-TD500, ICDAR 2019 MLT, Total-Text, CTW1500, ICDAR 2019 ArT and DAST1500, show that the proposed model outperforms existing methods on our dataset as well as on the standard datasets.
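The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind MSER-style component extraction, not the authors' method. It assumes a small grayscale image stored as nested lists, groups pixels at or below a threshold into 4-connected components, and scores a component by the standard MSER relative area change across neighbouring thresholds. The function names `extremal_regions`, `region_area_at` and `stability` are hypothetical and chosen for illustration.

```python
from collections import deque


def extremal_regions(image, threshold):
    """Return 4-connected components of pixels with intensity <= threshold."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] > threshold:
                continue
            # Breadth-first flood fill from an unvisited dark pixel.
            queue = deque([(r, c)])
            seen[r][c] = True
            region = []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] <= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(region)
    return regions


def region_area_at(image, seed, threshold):
    """Area of the component containing `seed` at `threshold` (0 if none)."""
    for region in extremal_regions(image, threshold):
        if seed in region:
            return len(region)
    return 0


def stability(image, seed, threshold, delta):
    """Relative area change (|R(t+d)| - |R(t-d)|) / |R(t)|; lower is more stable."""
    area = region_area_at(image, seed, threshold)
    if area == 0:
        return None
    grown = region_area_at(image, seed, threshold + delta)
    shrunk = region_area_at(image, seed, threshold - delta)
    return (grown - shrunk) / area


# Toy image (assumed data): two dark blobs on a bright background.
image = [
    [200, 200, 200, 200, 200, 200, 200],
    [200,  10,  10, 200,  12,  12, 200],
    [200,  10,  10, 200,  12,  12, 200],
    [200, 200, 200, 200, 200, 200, 200],
]
components = extremal_regions(image, 100)
score = stability(image, (1, 1), 100, 50)
```

In this toy case each dark blob forms its own component, and its area stays the same over a wide range of thresholds, which is the property that makes a region "maximally stable". A practical detector would use an optimized MSER implementation and would follow it with the shape features and classifier described in the abstract.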
Citation
Jain, T., Palaiahnakote, S., Pal, U., & Liu, C.-L. (2021). Deformable scene text detection using harmonic features and modified pixel aggregation network. Pattern Recognition Letters, 152, 135-142. https://doi.org/10.1016/j.patrec.2021.10.006
Journal Article Type | Article |
---|---|
Acceptance Date | Oct 6, 2021 |
Online Publication Date | Oct 8, 2021 |
Publication Date | 2021-12 |
Deposit Date | Nov 15, 2024 |
Journal | Pattern Recognition Letters |
Print ISSN | 0167-8655 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 152 |
Pages | 135-142 |
DOI | https://doi.org/10.1016/j.patrec.2021.10.006 |