A new deep CNN for 3D text localization in the wild through shadow removal
Shivakumara, Palaiahnakote; Banerjee, Ayan; Nandanwar, Lokesh; Pal, Umapada; Antonacopoulos, Apostolos; Lu, Tong; Blumenstein, Michael
Authors
Palaiahnakote Shivakumara
Ayan Banerjee
Lokesh Nandanwar
Umapada Pal
Prof Apostolos Antonacopoulos (Professor), A.Antonacopoulos@salford.ac.uk
Tong Lu
Michael Blumenstein
Abstract
Text localization in the wild is challenging due to the presence of both 2D and 3D text, shadows, arbitrarily oriented text with non-linear arrangements, varying lighting conditions, and complex backgrounds. This paper proposes the first approach for 3D text localization in natural scene images through shadow removal and a new deep CNN model. In the first step, exploiting the observation that 3D text generates shadow information in natural scenes, the proposed model detects and removes the shadow pixels of 3D text based on the Generalized Gradient Vector Flow concept and a new clustering approach. The classification of 2D and 3D text in scene images is strengthened by key features, including pixel strength, sharpness and edge potential, which are extracted to eliminate false text and shadow pixels. For text localization after shadow removal, EfficientNet is used as the encoder (backbone) and UNet as the decoder, combined in a novel way that employs differential binarization. Experimental validation and comparative analysis with state-of-the-art approaches, carried out for each step of the method on a new purpose-built dataset and on the benchmark datasets ICDAR MLT 2019, ICDAR ArT 2019, CTW1500, DAST1500, Total-Text, and MSRA-TD500, show that the proposed approach outperforms existing methods.
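As an illustration of the binarization step mentioned in the abstract, the following minimal sketch assumes that "differential binarization" refers to the approximate, sigmoid-based step function commonly known as differentiable binarization, B = 1 / (1 + exp(-k (P - T))). This is an assumption, not the authors' confirmed implementation; the function name, the steepness factor `k = 50`, and the toy probability and threshold maps are illustrative only.

```python
import math

def differentiable_binarization(prob_map, thresh_map, k=50.0):
    """Soft-binarize a probability map against a per-pixel threshold map.

    Assumed form (not taken from the paper): B = 1 / (1 + exp(-k * (P - T))).
    prob_map and thresh_map are equally sized 2D lists with values in [0, 1];
    k is a hypothetical steepness factor controlling how sharp the step is.
    """
    return [
        [1.0 / (1.0 + math.exp(-k * (p - t))) for p, t in zip(p_row, t_row)]
        for p_row, t_row in zip(prob_map, thresh_map)
    ]

# Toy 2x2 maps (made-up values): text probability P and learned threshold T.
P = [[0.9, 0.2], [0.5, 0.7]]
T = [[0.3, 0.3], [0.5, 0.6]]
B = differentiable_binarization(P, T)
```

Pixels whose probability clearly exceeds the threshold map are pushed toward 1, pixels below it toward 0, and a pixel exactly at its threshold stays at 0.5. Because the function is smooth, it can sit inside a network trained end to end, unlike a hard threshold.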
Citation
Shivakumara, P., Banerjee, A., Nandanwar, L., Pal, U., Antonacopoulos, A., Lu, T., & Blumenstein, M. (2024). A new deep CNN for 3D text localization in the wild through shadow removal. Computer Vision and Image Understanding, 238, https://doi.org/10.1016/j.cviu.2023.103863
Journal Article Type | Article |
---|---|
Acceptance Date | Oct 13, 2023 |
Online Publication Date | Oct 18, 2023 |
Publication Date | 2024-01 |
Deposit Date | Dec 15, 2023 |
Publicly Available Date | Oct 19, 2025 |
Journal | Computer Vision and Image Understanding |
Print ISSN | 1077-3142 |
Publisher | Elsevier |
Peer Reviewed | Peer Reviewed |
Volume | 238 |
DOI | https://doi.org/10.1016/j.cviu.2023.103863 |
Keywords | Text localization, Gradient vector flow, Shadow removal, 3D text classification, 3D text localization |
Files
This file is under embargo until Oct 19, 2025 due to copyright reasons.
Contact A.Antonacopoulos@salford.ac.uk to request a copy for personal use.