Pinaki Nath Chowdhury
An Episodic Learning Network for Text Detection on Human Bodies in Sports Images
Nath Chowdhury, Pinaki; Shivakumara, Palaiahnakote; Raghavendra, Ramachandra; Nag, Sauradip; Pal, Umapada; Lu, Tong; Lopresti, Daniel
Authors
Dr Shivakumara Palaiahnakote S.Palaiahnakote@salford.ac.uk
Lecturer in Computer Vision
Ramachandra Raghavendra
Sauradip Nag
Umapada Pal
Tong Lu
Daniel Lopresti
Abstract
Due to the proliferation of sports-related multimedia content on the web, effective visual search and retrieval present interesting research challenges. These are caused by poor image quality, a wide range of possible camera points of view, pose variations of athletes engaged in playing a sport, deformations of text appearing on athletes' clothing and uniforms in motion, occlusions caused by other objects, etc. To address these challenges, this paper presents a new method for detecting text on human bodies in sports images. Unlike most existing methods, which attempt to exploit the locations of a player's torso, face, and skin, we propose an end-to-end episodic learning approach that employs inductive learning criteria for detecting clothing regions in an image, which are then used for text detection. Our method integrates a Residual Network (ResNet) and a Pyramidal Pooling Module (PPM) for generating a spatial attention map. The Progressive Scale Expansion (PSE) algorithm is adapted for text detection from these regions. Experimental results on our own dataset as well as several benchmarks (such as RBNR and MMM, which contain images of marathon runners, and Re-ID, a person re-identification dataset) demonstrate that the proposed method outperforms existing methods in terms of precision and F1-score. We also present results for sports images chosen from natural scene text detection datasets such as CTW1500 and MS-COCO to show that the proposed method is effective and reliable across a range of inputs.
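For readers unfamiliar with the metrics named in the abstract, the following is a minimal sketch of how precision, recall, and F1-score are conventionally computed from detection counts. The function name `detection_scores` and the counts used are hypothetical, chosen only for illustration; they are not taken from the paper, and the paper's exact matching criteria for counting a detection as correct are not stated in this record.

```python
def detection_scores(true_positives, false_positives, false_negatives):
    """Return (precision, recall, f1) from detection counts.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    predicted = true_positives + false_positives   # all boxes the detector produced
    actual = true_positives + false_negatives      # all ground-truth text instances
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, not results from the paper.
p, r, f = detection_scores(true_positives=80, false_positives=20, false_negatives=20)
print(f"precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Because F1 is a harmonic mean, it stays low unless precision and recall are both high, which is why it is commonly reported alongside precision when comparing detectors.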
Citation
Nath Chowdhury, P., Shivakumara, P., Raghavendra, R., Nag, S., Pal, U., Lu, T., & Lopresti, D. (2022). An Episodic Learning Network for Text Detection on Human Bodies in Sports Images. IEEE Transactions on Circuits and Systems for Video Technology, 32, 2279 - 2289. https://doi.org/10.1109/TCSVT.2021.3092713
Journal Article Type | Article |
---|---|
Acceptance Date | Jun 18, 2021 |
Publication Date | 2022-04 |
Deposit Date | Feb 2, 2024 |
Journal | IEEE Transactions on Circuits and Systems for Video Technology |
Print ISSN | 1051-8215 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
Volume | 32 |
Pages | 2279 - 2289 |
DOI | https://doi.org/10.1109/TCSVT.2021.3092713 |