Arnab Halder
A New Unsupervised Approach for Text Localization in Shaky and Non-shaky Scene Video
Halder, Arnab; Palaiahnakote, Shivakumara; Pal, Umapada; Blumenstein, Michael; Liu, Cheng-Lin
Authors
Dr Shivakumara Palaiahnakote, Lecturer (S.Palaiahnakote@salford.ac.uk)
Umapada Pal
Michael Blumenstein
Cheng-Lin Liu
Abstract
Text detection in shaky and non-shaky videos is challenging because of poor video quality and the presence of static and dynamic obstacles. Video captured by a camera that shakes in the wind is considered shaky video, while video captured by a fixed camera is considered non-shaky video. Most state-of-the-art methods achieve their best results by using deep learning. The present study proposes an unsupervised approach for text spotting in shaky and non-shaky videos. In the first stage, our method selects keyframes, which we call activation frames, from the input video by estimating the similarity between temporal frames. For each activation frame, the proposed method extracts statistical features that represent text information, such as orientation, spectral, edge density and intensity features. The extracted features are fed to a K-means clustering method to obtain the text clusters, which yield text regions in the activation frames. For each region, the proposed method uses optical flow to extract spatial consistency, motion consistency and depth map consistency for localizing text using temporal voting-non-maximum suppression. Experiments are conducted on our shaky and non-shaky dataset and on the ICDAR 2015 benchmark dataset. The experiments show that the proposed method is superior to existing methods.
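The first two stages described in the abstract (activation-frame selection and K-means clustering of statistical features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The similarity measure (normalized cross-correlation), the 0.9 threshold, the reduced feature set (block mean intensity and gradient-based edge density only), the farthest-point K-means initialization, and all function names are assumptions made for illustration. The optical-flow consistency and temporal voting stages are not covered.

```python
import numpy as np

def frame_similarity(a, b):
    """Normalized cross-correlation of two grayscale frames (assumed measure)."""
    a = a.astype(np.float64).ravel()
    b = b.astype(np.float64).ravel()
    a, b = a - a.mean(), b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        # At least one frame is constant; treat two constant frames as identical.
        return 1.0 if np.array_equal(a, b) else 0.0
    return float(a @ b / denom)

def select_activation_frames(frames, threshold=0.9):
    """Keep a frame when its similarity to the last kept frame falls below
    the (hypothetical) threshold."""
    if len(frames) == 0:
        return []
    kept = [0]
    for i in range(1, len(frames)):
        if frame_similarity(frames[kept[-1]], frames[i]) < threshold:
            kept.append(i)
    return kept

def block_features(frame, block=8):
    """Per-block [mean intensity, edge density]; a reduced, assumed feature set."""
    f = frame.astype(np.float64)
    gy, gx = np.gradient(f)
    mag = np.hypot(gx, gy)
    h, w = f.shape
    h, w = h - h % block, w - w % block  # crop to a whole number of blocks

    def pool(x):
        return x[:h, :w].reshape(h // block, block, w // block, block).mean(axis=(1, 3))

    feats = np.stack([pool(f).ravel(), pool(mag).ravel()], axis=1)
    return feats, (h // block, w // block)

def kmeans(x, k=2, iters=20):
    """Plain K-means with deterministic farthest-point initialization."""
    centers = [x[0]]
    for _ in range(1, k):
        d = np.linalg.norm(x[:, None, :] - np.array(centers)[None, :, :], axis=2)
        centers.append(x[d.min(axis=1).argmax()])
    centers = np.array(centers, dtype=np.float64)
    labels = np.zeros(len(x), dtype=int)
    for _ in range(iters):
        d = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels, centers

def text_block_mask(frame, block=8):
    """Mark blocks in the cluster with the higher mean edge density as
    text candidates (an assumed selection rule)."""
    feats, grid = block_features(frame, block)
    labels, centers = kmeans(feats, k=2)
    text_cluster = centers[:, 1].argmax()
    return (labels == text_cluster).reshape(grid)
```

A typical call would pass a list of 2-D grayscale arrays to `select_activation_frames` and then apply `text_block_mask` to each selected frame. The features are left unscaled here for brevity, so a real pipeline would likely normalize them before clustering.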
Citation
Halder, A., Palaiahnakote, S., Pal, U., Blumenstein, M., & Liu, C.-L. (2024). A New Unsupervised Approach for Text Localization in Shaky and Non-shaky Scene Video. https://doi.org/10.1007/978-3-031-70549-6_10
Field | Value |
---|---|
Conference Name | Document Analysis and Recognition - ICDAR 2024 |
Conference Location | Athens, Greece |
Start Date | Aug 30, 2024 |
End Date | Sep 4, 2024 |
Acceptance Date | Aug 30, 2024 |
Online Publication Date | Sep 9, 2024 |
Publication Date | 2024 |
Deposit Date | Nov 15, 2024 |
Publicly Available Date | Sep 10, 2025 |
Publisher | Springer |
Series ISSN | 0302-9743 |
ISBN | 978-3-031-70548-9 |
DOI | https://doi.org/10.1007/978-3-031-70549-6_10 |
Files
This file is under embargo until Sep 10, 2025 due to copyright reasons.
Contact S.Palaiahnakote@salford.ac.uk to request a copy for personal use.