TTS: Hilbert Transform-based Generative Adversarial Network for Tattoo and Scene Text Spotting
Banerjee, Ayan; Palaiahnakote, Shivakumara; Antonacopoulos, Apostolos; Pal, Umapada; Lu, Tong; Canet, Josep Llados
Authors
Ayan Banerjee
Dr Shivakumara Palaiahnakote, Lecturer, S.Palaiahnakote@salford.ac.uk
Prof Apostolos Antonacopoulos, Professor, A.Antonacopoulos@salford.ac.uk
Umapada Pal
Tong Lu
Josep Llados Canet
Abstract
Text spotting in natural scenes is of increasing interest and significance due to its critical role in several applications, such as visual question answering, named entity recognition and event rumor detection on social media. One newly emerging and challenging problem is Tattoo Text Spotting (TTS) in images, for assisting forensic teams and for person identification. Unlike the generally simpler scene text addressed by current state-of-the-art methods, tattoo text is typically characterized by decorative backgrounds, calligraphic handwriting and several distortions due to the deformable nature of the skin. This paper describes the first approach to address TTS in a real-world application context by designing an end-to-end text spotting method employing a Hilbert transform-based Generative Adversarial Network (GAN). To reduce the complexity of the TTS task, the proposed approach first detects fine details in the image using the Hilbert transform and the Optimum Phase Congruency (OPC). To overcome the challenge of having only a relatively small number of training samples, a GAN is then used for generating suitable text samples and descriptors for text spotting (i.e. both detection and recognition). The superior performance of the proposed TTS approach over state-of-the-art methods, for both tattoo and general scene text, is demonstrated on a new TTS-specific dataset (publicly available) as well as on the existing benchmark natural scene text datasets: Total-Text, CTW1500 and ICDAR 2015.
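As a rough illustration of the Hilbert transform step mentioned in the abstract, the sketch below builds the analytic signal of a 1-D intensity profile and reads off its local amplitude and phase, the kind of quantities that phase congruency measures draw on. It is not the authors' implementation: the function names, the naive DFT, and the use of a single 1-D profile are assumptions made for illustration, and the paper's Optimum Phase Congruency and GAN stages are not reproduced here.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short toy signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def analytic_signal(x):
    """Return x + i*H[x], where H[x] is the discrete Hilbert transform of x."""
    n = len(x)
    spectrum = dft(x)
    # Keep DC (and Nyquist when n is even), double the positive
    # frequencies, and zero out the negative frequencies.
    weights = [0.0] * n
    weights[0] = 1.0
    if n % 2 == 0:
        weights[n // 2] = 1.0
        upper = n // 2
    else:
        upper = (n + 1) // 2
    for k in range(1, upper):
        weights[k] = 2.0
    return dft([s * w for s, w in zip(spectrum, weights)], inverse=True)


def local_amplitude_and_phase(x):
    """Local amplitude |z| and local phase arg(z) of the analytic signal z."""
    z = analytic_signal(x)
    return [abs(v) for v in z], [cmath.phase(v) for v in z]


# Hypothetical toy input: one image row modelled as a cosine with 2 cycles.
row = [math.cos(2 * math.pi * 2 * t / 16) for t in range(16)]
amplitude, phase = local_amplitude_and_phase(row)
```

For a pure cosine with a whole number of cycles, the imaginary part of the analytic signal is the matching sine, so the local amplitude is constant. Phase-based feature detectors typically compute such amplitude and phase values per band-pass scale on real image data rather than on a raw profile as done here.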
Citation
Banerjee, A., Palaiahnakote, S., Antonacopoulos, A., Pal, U., Lu, T., & Canet, J. L. (2024). TTS: Hilbert Transform-based Generative Adversarial Network for Tattoo and Scene Text Spotting. IEEE Transactions on Multimedia, 1-15. https://doi.org/10.1109/tmm.2024.3378458
Journal Article Type | Article |
---|---|
Acceptance Date | Mar 3, 2024 |
Publication Date | Mar 29, 2024 |
Deposit Date | Mar 9, 2024 |
Publicly Available Date | Mar 29, 2024 |
Journal | IEEE Transactions on Multimedia |
Print ISSN | 1520-9210 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Pages | 1-15 |
DOI | https://doi.org/10.1109/tmm.2024.3378458 |
Keywords | Electrical and Electronic Engineering, Computer Science Applications, Media Technology, Signal Processing |
Files
Accepted Version (PDF, 1.6 Mb)
Copyright Statement
© 20XX IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.