An Episodic Learning Network for Text Detection on Human Bodies in Sports Images

Nath Chowdhury, Pinaki; Shivakumara, Palaiahnakote; Raghavendra, Ramachandra; Nag, Sauradip; Pal, Umapada; Lu, Tong; Lopresti, Daniel




Due to the proliferation of sports-related multimedia content on the web, effective visual search and retrieval present interesting research challenges. These challenges arise from poor image quality, a wide range of possible camera viewpoints, pose variations of athletes in play, deformations of text on athletes' clothing and uniforms in motion, occlusions caused by other objects, and similar factors. To address them, this paper presents a new method for detecting text on human bodies in sports images. Unlike most existing methods, which attempt to exploit the locations of a player's torso, face, and skin, we propose an end-to-end episodic learning approach that employs inductive learning criteria to detect clothing regions in an image; these regions are then used for text detection. Our method integrates a Residual Network (ResNet) and a Pyramidal Pooling Module (PPM) to generate a spatial attention map. The Progressive Scalable Expansion Algorithm (PSE) is adapted for text detection from these regions. Experimental results on our own dataset and on several benchmarks (RBNR and MMM, which contain images of marathon runners, and Re-ID, a person re-identification dataset) demonstrate that the proposed method outperforms existing methods in terms of precision and F1-score. We also present results on sports images chosen from natural scene text detection datasets such as CTW1500 and MS-COCO to show that the proposed method is effective and reliable across a range of inputs.
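The abstract states that a PSE-style expansion is adapted for text detection but gives no details of the adaptation. As a rough illustration only, the following is a minimal sketch of the general idea behind progressive scale expansion as it is commonly described: text instances are seeded from the smallest predicted kernel and grown breadth-first through successively larger kernels, with contested pixels going to whichever instance reaches them first. The function names, the plain nested-list mask format, and the 4-connectivity choice are assumptions made for this example; this is not the authors' implementation.

```python
from collections import deque

# 4-connected neighbourhood (an assumption for this sketch)
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def label_components(mask):
    """Label 4-connected components of a binary grid; returns (labels, count)."""
    h, w = len(mask), len(mask[0])
    labels = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and labels[r][c] == 0:
                count += 1
                labels[r][c] = count
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


def progressive_scale_expansion(kernels):
    """Grow instance labels from the smallest kernel through larger ones.

    `kernels` is a list of binary masks (nested lists of 0/1) ordered from
    the smallest kernel to the largest, all with the same shape.
    """
    labels, count = label_components(kernels[0])
    h, w = len(labels), len(labels[0])
    for kernel in kernels[1:]:
        # Start the breadth-first expansion from every pixel labelled so far.
        queue = deque((r, c) for r in range(h) for c in range(w) if labels[r][c])
        while queue:
            y, x = queue.popleft()
            for dy, dx in NEIGHBOURS:
                ny, nx = y + dy, x + dx
                if (0 <= ny < h and 0 <= nx < w
                        and kernel[ny][nx] and labels[ny][nx] == 0):
                    # First instance to reach a pixel keeps it.
                    labels[ny][nx] = labels[y][x]
                    queue.append((ny, nx))
    return labels, count


# Two seeds in a single row, expanded through one larger kernel.
small = [[0, 1, 0, 0, 0, 1, 0]]
large = [[1, 1, 1, 0, 1, 1, 1]]
labels, count = progressive_scale_expansion([small, large])
```

Seeding from the smallest kernel is what keeps adjacent text instances apart: each seed carries its own label before any growth happens, so two instances that touch in the largest kernel are still reported separately.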


Nath Chowdhury, P., Shivakumara, P., Raghavendra, R., Nag, S., Pal, U., Lu, T., & Lopresti, D. (2022). An Episodic Learning Network for Text Detection on Human Bodies in Sports Images. IEEE Transactions on Circuits and Systems for Video Technology, 32, 2279 - 2289.

Journal Article Type: Article
Acceptance Date: Jun 18, 2021
Publication Date: 2022-04
Deposit Date: Feb 2, 2024
Journal: IEEE Transactions on Circuits and Systems for Video Technology
Print ISSN: 1051-8215
Publisher: Institute of Electrical and Electronics Engineers
Peer Reviewed: Yes
Volume: 32
Pages: 2279-2289