Said Salloum
Clustering Medical Transcriptions Using K-Means
Salloum, Said; Tahat, Dina; Tahat, Khalaf; Alfaisal, Raghad; Salloum, Ayham
Authors
Dina Tahat
Khalaf Tahat
Raghad Alfaisal
Ayham Salloum
Abstract
The clustering of medical transcriptions is an essential task for the categorization and summarization of large volumes of medical records. This paper explores the efficacy of k-means clustering, a well-known unsupervised machine learning algorithm, to discern patterns and segregate medical transcriptions into distinct clusters. We processed a dataset comprising various medical reports, systematically cleaning and preparing the text for analysis. By employing a Term Frequency-Inverse Document Frequency (TF-IDF) approach, we converted the textual data into a vectorized format amenable to machine learning methods. Subsequent dimensionality reduction through Principal Component Analysis (PCA) facilitated the visualization and interpretation of the high-dimensional data in two-dimensional space. The k-means algorithm was then applied, revealing five distinct clusters. Each cluster was characterized by examining the prevalence of key terms, uncovering thematic consistencies that may correspond to particular medical procedures or specialties. The resulting clusters demonstrate the algorithm's potential to automatically categorize medical documentation in a way that mirrors clinical relevance, thereby providing a foundation for improved information management systems in healthcare settings.
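The core of the pipeline described in the abstract (TF-IDF vectorization, k-means clustering, and inspection of each cluster's key terms) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the four toy documents, the smoothed IDF variant, the farthest-first initialization, the helper names (`tfidf_vectors`, `kmeans`, `top_terms`), and `k=2` are all hypothetical choices made for the example (the paper reports five clusters on its own dataset). The PCA projection used for visualization is omitted for brevity.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, L2-normalized TF-IDF vectors) for tokenized docs."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF (one common variant, assumed here): ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = [tf[t] * idf[t] for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm with deterministic farthest-first initialization."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from its nearest already-chosen seed.
        centroids.append(
            max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        )
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break  # converged: assignments no longer change
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def top_terms(centroid, vocab, n=3):
    """Terms with the largest centroid weights, used to characterize a cluster."""
    ranked = sorted(zip(centroid, vocab), reverse=True)
    return [term for weight, term in ranked[:n] if weight > 0]


# Hypothetical toy corpus: two orthopedic notes and two cardiology notes.
docs = [
    "knee arthroscopy meniscus repair knee".split(),
    "knee replacement surgery meniscus".split(),
    "chest pain ecg cardiac catheterization".split(),
    "cardiac ecg stress test chest".split(),
]
vocab, X = tfidf_vectors(docs)
labels, centroids = kmeans(X, k=2)
for j, centroid in enumerate(centroids):
    print(j, top_terms(centroid, vocab))
```

The farthest-first seeding is used here only to keep the toy run reproducible; on a real corpus one would more likely rely on a library such as scikit-learn (`TfidfVectorizer`, `PCA`, `KMeans`) with randomized or k-means++ initialization and a fixed seed.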
Presentation Conference Type | Conference Paper (published) |
---|---|
Conference Name | 2024 International Conference on Intelligent Computing, Communication, Networking and Services (ICCNS) |
Start Date | Sep 24, 2024 |
End Date | Sep 27, 2024 |
Publication Date | Sep 24, 2024 |
Deposit Date | Feb 5, 2025 |
Peer Reviewed | Peer Reviewed |
Pages | 291-294 |
Book Title | 2024 International Conference on Intelligent Computing, Communication, Networking and Services (ICCNS) |
ISBN | 9798350354706 |
DOI | https://doi.org/10.1109/iccns62192.2024.10776237 |