Leveraging K-Means Clustering for Analysis of Arabic Hate Speech Tweets

Salloum, Said; Tahat, Khalaf; Mansoori, Ahmed; Alfaisal, Raghad; Tahat, Dina

doi:10.1109/gcet64327.2024.10934641

Authors

Said Salloum

Khalaf Tahat

Ahmed Mansoori

Raghad Alfaisal

Dina Tahat

Abstract

As hate speech becomes more common on social media platforms, detecting and curbing it is important for providing a better and safer online environment. Because hate speech detection has relied heavily on manual methods, researchers have increasingly turned to automated methods based on machine learning. Many available hate speech datasets and detection models are largely inadequate for Arabic hate speech because of the complexity of the language and its cultural nuances. This paper addresses these difficulties by applying K-Means clustering to Arabic hate speech tweets in the L-HSAB dataset. The methodology consisted of data exploration and preprocessing, Term Frequency-Inverse Document Frequency (TF-IDF) feature extraction, dimensionality reduction through Principal Component Analysis (PCA), and clustering via K-Means. These steps helped to identify distinct groups of hate speech tweets and to understand their common topics. The analysis yields new insight into how Arabic hate speech spreads online and offers the potential for tailored interventions. Automated hate speech analysis via machine learning would allow policymakers to formulate tailored strategies focused on making the Internet safer and communities more harmonious.
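The pipeline named in the abstract (TF-IDF, then PCA, then K-Means) can be illustrated with a minimal sketch. This is not the authors' code: the use of scikit-learn, the function name `cluster_texts`, the parameter values, and the neutral English toy sentences are all assumptions made for illustration, and the L-HSAB data and any Arabic-specific preprocessing are not reproduced here.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_texts(texts, n_clusters=2, n_components=2, seed=0):
    """Assign each text to a cluster via TF-IDF -> PCA -> K-Means.

    All parameter defaults are illustrative assumptions, not values
    reported in the paper.
    """
    # Step 1: turn raw texts into TF-IDF weighted term vectors.
    tfidf = TfidfVectorizer().fit_transform(texts)

    # Step 2: reduce dimensionality with PCA (PCA needs a dense array).
    reduced = PCA(n_components=n_components, random_state=seed).fit_transform(
        tfidf.toarray()
    )

    # Step 3: group the reduced vectors with K-Means.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(reduced)


# Hypothetical placeholder documents (NOT taken from L-HSAB): two
# neutral topics with disjoint vocabularies, three texts each.
toy_docs = [
    "football team wins match",
    "football team loses match",
    "football team draws match",
    "heavy rain storm today",
    "heavy rain storm tonight",
    "heavy rain storm tomorrow",
]

if __name__ == "__main__":
    for text, label in zip(toy_docs, cluster_texts(toy_docs)):
        print(label, text)
```

On real tweets, the number of clusters and PCA components would need to be chosen empirically (for example with an elbow or silhouette analysis); the record does not state which values the paper used.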

Presentation Conference Type	Conference Paper (published)
Conference Name	Global Congress on Emerging Technologies (GCET-2024)
Start Date	Dec 9, 2024
End Date	Dec 11, 2024
Publication Date	Dec 9, 2024
Deposit Date	Apr 16, 2025
Peer Reviewed	Peer Reviewed
Pages	282-285
Book Title	Global Congress on Emerging Technologies (GCET-2024)
ISBN	979-8-3315-4261-0
DOI	https://doi.org/10.1109/gcet64327.2024.10934641

You might also like

The adoption of metaverse systems: a hybrid SEM - ML method (2022)
Presentation / Conference

Determinants predicting the electronic medical record adoption in healthcare: A SEM-Artificial Neural Network approach (2022)
Journal Article

A systematic literature review on phishing email detection using natural language processing techniques (2022)
Journal Article

Prediction of User’s Intention to Use Metaverse System in Medical Education: A Hybrid SEM-ML Learning Approach (2022)
Journal Article

SEM-ANN-based approach to understanding students' academic-performance adoption of YouTube for learning during Covid. (2022)
Journal Article
