A new deep wavefront based model for text localization in 3D video
Nandanwar, L; Shivakumara, P; Ramachandra, R; Lu, T; Pal, U; Antonacopoulos, A; Lu, Y
Authors
L Nandanwar
P Shivakumara
R Ramachandra
T Lu
U Pal
Prof Apostolos Antonacopoulos (Professor) A.Antonacopoulos@salford.ac.uk
Y Lu
Abstract
With the evolution of electronic devices such as 3D cameras, text localization in 3D video (e.g., for indexing) is drawing increasing attention from the multimedia and video processing community. Existing methods focus on 2D video, and their performance degrades on the challenges specific to 3D video, such as shadow areas associated with text and irregularly sized and shaped text. This paper proposes the first approach that successfully addresses the challenges of 3D video in addition to those of 2D video. It employs several innovations. The first is the use of Generalized Gradient Vector Flow (GGVF) for dominant point detection. The second is the Wavefront concept for detecting text candidate points from those dominant points. In addition, an Adaptive B-Spline Polygon Curve Network (ABS-Net) is proposed for accurate text localization in 3D video by constructing tight-fitting bounding polygons from the text candidate points. Extensive experiments on custom 3D video data and on standard 2D video and scene text datasets show that the proposed method is practical and useful, and overall outperforms existing state-of-the-art methods.
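The idea of enclosing text with a bounding polygon built from candidate points can be illustrated with a plain closed B-spline. The sketch below is not ABS-Net, which the abstract describes as a network and does not specify further. It only shows, under stated assumptions, how a smooth closed curve can be drawn from a set of 2D points. The function names `order_by_angle` and `closed_cubic_bspline` are hypothetical, and ordering the points by angle around their centroid is an assumption that suits roughly convex point sets only.

```python
import math


def order_by_angle(points):
    """Order 2D points counter-clockwise around their centroid.

    Assumption for illustration: the points outline a roughly convex region.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sorted(points, key=lambda p: math.atan2(p[1] - cy, p[0] - cx))


def closed_cubic_bspline(control, samples_per_segment=8):
    """Sample a closed uniform cubic B-spline defined by ordered control points.

    The curve approximates the control points (it does not pass through them)
    and stays inside their convex hull.
    """
    n = len(control)
    if n < 4:
        raise ValueError("need at least 4 control points")
    curve = []
    for i in range(n):
        # Four consecutive control points, wrapping around to close the curve.
        p0, p1, p2, p3 = (control[(i + k) % n] for k in range(4))
        for s in range(samples_per_segment):
            t = s / samples_per_segment
            # Uniform cubic B-spline basis functions; they sum to 1 for any t.
            b0 = (1 - t) ** 3 / 6
            b1 = (3 * t ** 3 - 6 * t ** 2 + 4) / 6
            b2 = (-3 * t ** 3 + 3 * t ** 2 + 3 * t + 1) / 6
            b3 = t ** 3 / 6
            x = b0 * p0[0] + b1 * p1[0] + b2 * p2[0] + b3 * p3[0]
            y = b0 * p0[1] + b1 * p1[1] + b2 * p2[1] + b3 * p3[1]
            curve.append((x, y))
    return curve


# Hypothetical candidate points (made up for illustration, not from the paper).
candidates = [(12, 5), (30, 4), (48, 9), (50, 20), (31, 24), (11, 19)]
polygon = closed_cubic_bspline(order_by_angle(candidates))
```

The returned list of `(x, y)` samples can be treated as the vertices of a smooth bounding polygon. An adaptive scheme, as the name ABS-Net suggests, would presumably choose or adjust the control points rather than take them as given.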
Citation
Nandanwar, L., Shivakumara, P., Ramachandra, R., Lu, T., Pal, U., Antonacopoulos, A., & Lu, Y. (2021). A new deep wavefront based model for text localization in 3D video. IEEE Transactions on Circuits and Systems for Video Technology, 32(6), 3375-3389. https://doi.org/10.1109/TCSVT.2021.3110990
Journal Article Type | Article |
---|---|
Acceptance Date | Aug 24, 2021 |
Online Publication Date | Sep 7, 2021 |
Publication Date | Sep 7, 2021 |
Deposit Date | Oct 13, 2021 |
Publicly Available Date | Oct 13, 2021 |
Journal | IEEE Transactions on Circuits and Systems for Video Technology |
Print ISSN | 1051-8215 |
Electronic ISSN | 1558-2205 |
Publisher | Institute of Electrical and Electronics Engineers |
Volume | 32 |
Issue | 6 |
Pages | 3375-3389 |
DOI | https://doi.org/10.1109/TCSVT.2021.3110990 |
Publisher URL | https://doi.org/10.1109/TCSVT.2021.3110990 |
Related Public URLs | http://ieeexplore.ieee.org/xpl/RecentIssue.jsp/?punumber=76 |
Additional Information | Access Information: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Funders: Natural Science Foundation of China (Grant Number: 61672273); Ministry of Higher Education, Malaysia (Grant Number: FRGS grant FP104-2020) |
Files
3D-Text-IEEE-T-CSVT_Author_Accepted_Manuscript.pdf
(6.6 Mb)
PDF