Rakesh Dey
SFTT: A Spatial-Frequency-Temporal-Based End-to-End Transformer for Heart Rate Estimation
Dey, Rakesh; Palaiahnakote, Shivakumara; Bhattacharya, Saumik; Pal, Umapada; Chanda, Sukalpa
Authors
Dr Shivakumara Palaiahnakote (S.Palaiahnakote@salford.ac.uk), Lecturer in Computer Vision
Saumik Bhattacharya
Umapada Pal
Sukalpa Chanda
Abstract
Vision-based heart rate (HR) estimation is challenging in adverse situations such as changes in skin tone, arbitrary face movements, and complex backgrounds. Unlike state-of-the-art models that use color and spatial-temporal information, the present work exploits a Spatial-Frequency-Temporal Transformer (SFTT) for HR estimation. To extract multi-scale contextual features, we propose an end-to-end transformer that encodes contextual information through a pyramidal structure. Furthermore, to strengthen the features, the proposed model introduces a new attention approach that performs mutual-sharing operations between the spatial-temporal and frequency-temporal domains in an end-to-end fashion. Experimental results on four standard datasets, namely UBFC-rPPG, VIPL-HR, OBF, and MMSE-HR, show that the proposed model is generic and invariant to the aforementioned challenges. A comparative study with state-of-the-art models demonstrates the effectiveness of the proposed method over existing methods on all four benchmark datasets. In addition, cross-dataset validation experiments show that the proposed method is reliable and robust.
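The abstract does not specify how the mutual-sharing attention is computed, so the following is only a minimal sketch of one plausible reading: two token streams that attend to each other with plain scaled dot-product attention. All names (`cross_attention`, `mutual_sharing`, `st`, `ft`), the token shapes, the absence of learned projections, and the use of an FFT magnitude spectrum as a stand-in for frequency-temporal features are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention(queries, context):
    # Scaled dot-product attention in which `queries` attend to `context`.
    # No learned projection matrices are used (an assumption of this sketch).
    d = queries.shape[-1]
    scores = queries @ context.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ context

def mutual_sharing(st_tokens, ft_tokens):
    # Hypothetical mutual-sharing step: each stream is updated with
    # information attended from the other stream (residual connection).
    st_out = st_tokens + cross_attention(st_tokens, ft_tokens)
    ft_out = ft_tokens + cross_attention(ft_tokens, st_tokens)
    return st_out, ft_out

rng = np.random.default_rng(0)
T, D = 16, 8                           # assumed: 16 frames, 8 features per frame
st = rng.standard_normal((T, D))       # stand-in spatial-temporal tokens
ft = np.abs(np.fft.rfft(st, axis=0))   # stand-in frequency tokens (magnitude spectrum over time)
st_out, ft_out = mutual_sharing(st, ft)
print(st_out.shape, ft_out.shape)
```

In this sketch each stream keeps its own length (16 time tokens, 9 frequency bins) while borrowing content from the other; a trained model would add learned query, key, and value projections and multiple heads.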
Journal Article Type | Article |
---|---|
Acceptance Date | May 15, 2025 |
Online Publication Date | Jul 3, 2025 |
Deposit Date | Jul 4, 2025 |
Publicly Available Date | Jul 7, 2025 |
Journal | IEEE Transactions on Emerging Topics in Computational Intelligence |
Print ISSN | 2168-6750 |
Publisher | Institute of Electrical and Electronics Engineers |
Peer Reviewed | Peer Reviewed |
DOI | https://doi.org/10.1109/TETCI.2025.3582841 |
Files: Accepted Version (PDF, 1.3 Mb)
Publisher Licence URL
http://creativecommons.org/licenses/by/4.0/