Research Repository

All Outputs (13)

A New Unsupervised Approach for Text Localization in Shaky and Non-shaky Scene Video (2024)
Conference Proceeding
Halder, A., Palaiahnakote, S., Pal, U., Blumenstein, M., & Liu, C.-L. (2024). A New Unsupervised Approach for Text Localization in Shaky and Non-shaky Scene Video. https://doi.org/10.1007/978-3-031-70549-6_10

Text detection in shaky and non-shaky videos is challenging due to poor video quality and the presence of static and dynamic obstacles. Video captured by a shaky camera due to wind is considered shaky video, while video captured by a fixed camera is...

EAU-Net: A New Edge-Attention Based U-Net for Nationality Identification (2022)
Conference Proceeding
Pal Choudhury, A., Shivakumara, P., Pal, U., & Liu, C.-L. (2022). EAU-Net: A New Edge-Attention Based U-Net for Nationality Identification. In Frontiers in Handwriting Recognition: 18th International Conference, ICFHR 2022, Hyderabad, India, December 4–7, 2022, Proceedings (137-152). https://doi.org/10.1007/978-3-031-21648-0_10

Identifying crime or individuals is one of the key tasks toward smart and safe city development when different nationals are involved. In this regard, identifying Nationality/Ethnicity through handwriting has received special attention. But due to fr...

ARNet: Active-Reference Network for Few-Shot Image Semantic Segmentation (2021)
Conference Proceeding
Shi, G., Wu, Y., Palaiahnakote, S., Pal, U., & Lu, T. (2021). ARNet: Active-Reference Network for Few-Shot Image Semantic Segmentation. In 2021 IEEE International Conference on Multimedia and Expo (ICME). https://doi.org/10.1109/ICME51207.2021.9428425

To make predictions on unseen classes, few-shot segmentation has recently become a research focus. However, most methods rely on pixel-level annotation, which requires a large amount of manual work. Moreover, inherent information on same-category objects to guide...

A New Method for Detecting Altered Text in Document Images (2020)
Conference Proceeding
Nandanwar, L., Shivakumara, P., Pal, U., Lu, T., Lopresti, D., Seraogi, B., & Chaudhuri, B. B. (2020). A New Method for Detecting Altered Text in Document Images. In Pattern Recognition and Artificial Intelligence (93-108). https://doi.org/10.1007/978-3-030-59830-3_8

As more and more office documents are captured, stored, and shared in digital format, and as image editing software becomes increasingly powerful, there is a growing concern about document authenticity. For example, texts in property documents c...

A New DCT-FFT Fusion Based Method for Caption and Scene Text Classification in Action Video Images (2020)
Conference Proceeding
Nandanwar, L., Shivakumara, P., Manna, S., Pal, U., Lu, T., & Blumenstein, M. (2020). A New DCT-FFT Fusion Based Method for Caption and Scene Text Classification in Action Video Images. In Pattern Recognition and Artificial Intelligence. https://doi.org/10.1007/978-3-030-59830-3_7

Achieving a better recognition rate for text in action video images is challenging due to multi-type texts with unpredictable backgrounds. We propose a new method for the classification of captions (which are edited text) and scene texts (which are part...

A text-context-aware CNN network for multi-oriented and multi-language scene text detection (2020)
Conference Proceeding
Xiao, Y., Xue, M., Lu, T., Wu, Y., & Palaiahnakote, S. (2020). A text-context-aware CNN network for multi-oriented and multi-language scene text detection. In 2019 International Conference on Document Analysis and Recognition (ICDAR). https://doi.org/10.1109/ICDAR.2019.00116

The existing deep learning based state-of-the-art scene text detection methods either treat scene texts as a type of general object or segment text regions directly. The latter category achieves remarkable detection results on arbitrary orientation and large...

Compressing YOLO network by compressive sensing (2018)
Conference Proceeding
Wu, Y., Meng, Z., Palaiahnakote, S., & Lu, T. (2018). Compressing YOLO network by compressive sensing. In 2017 4th IAPR Asian Conference on Pattern Recognition (ACPR). https://doi.org/10.1109/ACPR.2017.11

Object detection is one of the fundamental challenges in the pattern recognition community. Recently, convolutional neural networks (CNNs) have been increasingly exploited in object detection, showing their promising potential for generatively discovering patte...

Local and Global Bayesian Network based Model for Flood Prediction (2018)
Conference Proceeding
Wu, Y., Xu, W., Feng, J., Palaiahnakote, S., & Lu, T. (2018). Local and Global Bayesian Network based Model for Flood Prediction. In 2018 24th International Conference on Pattern Recognition (ICPR). https://doi.org/10.1109/ICPR.2018.8546257

To minimize the negative impacts of floods, researchers from the pattern recognition community pay special attention to the problem of flood prediction by applying machine learning techniques. In this paper, we propose to construct hierarch...

Em-SLAM: A Fast and Robust Monocular SLAM Method for Embedded Systems (2018)
Conference Proceeding
Wu, Y., Li, Z., Palaiahnakote, S., & Lu, T. (2018). Em-SLAM: A Fast and Robust Monocular SLAM Method for Embedded Systems. In 2018 24th International Conference on Pattern Recognition (ICPR). https://doi.org/10.1109/ICPR.2018.8545173

Simultaneous Localization and Mapping (SLAM) is difficult to deploy on embedded systems due to its high computational cost and its requirement for stable input. Building on excellent algorithms of recent years, we present Em-SLAM, a monocular SLAM method...

Context-Aware Attention LSTM Network for Flood Prediction (2018)
Conference Proceeding
Wu, Y., Liu, Z., Xu, W., Feng, J., Palaiahnakote, S., & Lu, T. (2018). Context-Aware Attention LSTM Network for Flood Prediction. In 2018 24th International Conference on Pattern Recognition (ICPR). https://doi.org/10.1109/ICPR.2018.8545385

To minimize the negative impacts of floods, researchers from the pattern recognition community use artificial-intelligence-based methods to solve the problem of flood prediction. Inspired by the significant power of Long Short-Term Memory (LS...

Cloud of line distribution for arbitrary text detection in scene/video/license plate images (2018)
Conference Proceeding
Wang, W., Wu, Y., Palaiahnakote, S., Lu, T., & Liu, J. (2018). Cloud of line distribution for arbitrary text detection in scene/video/license plate images. In Advances in Multimedia Information Processing – PCM 2017 (443). https://doi.org/10.1007/978-3-319-77380-3_41

Detecting arbitrarily oriented text in scene and license plate images is challenging due to multiple adverse factors caused by images from diverse applications. This paper proposes a novel idea of extracting Cloud of Line Distribution (COLD) for the...

A Robust Symmetry-Based Method for Scene/Video Text Detection through Neural Network (2018)
Conference Proceeding
Wu, Y., Wang, W., Palaiahnakote, S., & Lu, T. (2018). A Robust Symmetry-Based Method for Scene/Video Text Detection through Neural Network. In 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR). https://doi.org/10.1109/ICDAR.2017.206

Text detection in video/scene images has gained significant attention in the field of image processing and document analysis due to the inherent challenges caused by variations in contrast, orientation, background, text type, font type, non-uniform...

Multi-oriented text detection for intra-frame in H.264/AVC video (2015)
Conference Proceeding
Minemura, K., Palaiahnakote, S., & Wong, K. (2015). Multi-oriented text detection for intra-frame in H.264/AVC video. In 2014 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS). https://doi.org/10.1109/ISPACS.2014.7024478

Text detection in compressed video has received much attention in recent years due to the effectiveness of DCT coefficients and motion vectors in realizing several applications. In this paper, a new text detection, which utilizes AC coefficients in t...