Toward an Efficient Hybrid Clustering and Classification Approach for Fake News Detection in Social Network

Wafa Alharthi*, Mohammad Eid Alzahrani

doi:10.17485/IJST/v17i26.579

Indian Journal of Science and Technology

Year: 2024, Volume: 17, Issue: 26, Pages: 2754-2762

Original Article

Wafa Alharthi¹*, Mohammad Eid Alzahrani¹

¹Department of Computer Science, Faculty of Computing & Information, Al-Baha University, Al-Baha, Saudi Arabia

*Corresponding Author

Received Date: 28 February 2024, Accepted Date: 18 June 2024, Published Date: 06 July 2024

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Objectives: The spread of fake news on social media has become a pressing issue. Despite the efforts of various organizations to address it, the problem persists, and more effective solutions are needed. This study implements a machine learning-based approach for identifying fake news on social media with improved accuracy.

Methods: The methodology combines a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network to extract semantic features from news articles. A hybrid clustering and classification approach is then applied, combining K-means clustering with an Artificial Neural Network (ANN) classifier. The data are secondary data consisting of 23481 world news articles obtained from the Reuters.com website and 21417 unreliable news articles from politifact.com. During training, the number of units and layers in the model was tuned to optimize performance. The model was compared with baseline models (KNN, SVM, Decision Tree, and Boosted Decision Tree) to establish the best-performing model.

Findings: The results from the clustering and classification algorithms were weighted into final classifications using Bayesian probability theory. The proposed approach achieved an accuracy of 99.78%, a sensitivity of 100%, and a specificity of 99.73%. The model's precision is 99.74%, meaning that nearly every article it flags as fake is in fact fake. The F-score is 99.87%, indicating a good balance between precision and sensitivity, that is, between avoiding false alarms on reliable news and catching fake news. The approach outperformed the other machine learning classifiers tested: KNN, SVM, Decision Tree, and Boosted Decision Tree.

Novelty: The study applies a hybrid approach that combines a classification algorithm with a clustering algorithm to improve detection of fake news on social media. The approach is tested on varied real-world datasets to establish its robustness under different vocabularies and vocabulary sizes.
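The abstract says the two model outputs are weighted with Bayesian probability theory but does not give the exact rule. The sketch below is one plausible reading, not the authors' implementation: it assumes each model yields a probability that an article is fake, and that the two models' evidence is conditionally independent given the true label. The function names `fuse_posteriors` and `f_score` and the prior of 0.5 are hypothetical. The second function applies the standard F-score formula to the precision and sensitivity reported in the Findings.

```python
def _clip(p: float, eps: float = 1e-9) -> float:
    """Keep a probability strictly inside (0, 1) so the odds stay finite."""
    return min(max(p, eps), 1.0 - eps)


def fuse_posteriors(p_cluster: float, p_ann: float, prior: float = 0.5) -> float:
    """Combine two estimates of P(fake | evidence) with Bayes' rule.

    ASSUMPTION (not stated in the paper): the clustering-based and ANN-based
    evidence are conditionally independent given the true label, and both
    models were calibrated against the same prior P(fake).
    """
    p_cluster, p_ann, prior = _clip(p_cluster), _clip(p_ann), _clip(prior)
    prior_odds = prior / (1.0 - prior)
    # Dividing posterior odds by prior odds gives each model's likelihood ratio.
    lr_cluster = (p_cluster / (1.0 - p_cluster)) / prior_odds
    lr_ann = (p_ann / (1.0 - p_ann)) / prior_odds
    # Under independence, the likelihood ratios multiply.
    fused_odds = prior_odds * lr_cluster * lr_ann
    return fused_odds / (1.0 + fused_odds)


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (sensitivity)."""
    return 2.0 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Two models that both lean toward "fake" reinforce each other.
    print(round(fuse_posteriors(0.8, 0.7), 4))
    # Consistency check on the reported precision (99.74%) and sensitivity (100%).
    print(round(f_score(0.9974, 1.0), 4))
```

With the reported precision of 99.74% and sensitivity of 100%, `f_score` rounds to 0.9987, which agrees with the 99.87% F-score stated in the Findings. In the fusion function, a model that outputs exactly the prior contributes a likelihood ratio of 1 and leaves the other model's estimate unchanged.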

Keywords: Artificial Intelligence, Machine Learning, Deep Learning, Hybrid clustering approach, Classification, CNN, LSTM, Boosted Decision Tree, Social platforms, K-means

Copyright

© 2024 Alharthi & Alzahrani. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)