P-ISSN 0974-6846 E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2024, Volume: 17, Issue: 26, Pages: 2754-2762

Original Article

Toward an Efficient Hybrid Clustering and Classification Approach for Fake News Detection in Social Network

Received Date: 28 February 2024, Accepted Date: 18 June 2024, Published Date: 06 July 2024

Abstract

Objectives: The spread of fake news on social media has become a pressing issue. Despite the efforts of various organizations to address it, the problem persists, which calls for more effective solutions. This study implements a machine learning-based approach that identifies fake news on social media with improved accuracy.

Methods: The methodology combines a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network to extract semantic features from news articles. A hybrid clustering and classification approach is then applied, combining K-means clustering with an Artificial Neural Network (ANN) classifier. The data are secondary data consisting of 23481 world news articles obtained from the Reuters.com website and 21417 unreliable news articles from polifact.com. During training, the number of units and layers in the model was tuned to optimize performance. The model was compared with baseline models (KNN, SVM, Decision Tree, and Boosted Decision Tree) to establish the best-performing model.

Findings: The outputs of the two algorithms were weighted and combined into final classifications using Bayesian probability theory. The proposed approach achieved an accuracy of 99.78%, a sensitivity of 100%, and a specificity of 99.73%. The model's precision is 99.74%, indicating that articles it flags as fake are almost always fake. The F-score of the approach is 99.87%, indicating that the model strikes a good balance between correctly classifying fake news articles and reliable news articles. The approach outperformed the other machine learning classifiers, including KNN, SVM, Decision Tree, and Boosted Decision Tree.

Novelty: The study applies a hybrid approach that combines a classification algorithm with a clustering algorithm to improve the detection of fake news on social media. The approach is tested on varied real-world datasets to establish its robustness under different vocabularies and vocabulary sizes.
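The reported F-score can be cross-checked against the reported precision and sensitivity, since the F-score is conventionally the harmonic mean of the two. The sketch below is illustrative only and is not taken from the study: the function names `f_score` and `metrics_from_counts` are hypothetical, and it assumes the abstract's F-score is the standard F1 measure with "fake" as the positive class.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Binary-classification metrics from confusion-matrix counts.

    Assumes "fake" is the positive class and every denominator is non-zero.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # recall on fake articles
        "specificity": tn / (tn + fp),   # recall on reliable articles
        "precision": tp / (tp + fp),
    }


# Values taken from the abstract: precision 99.74%, sensitivity (recall) 100%.
reported_f = f_score(precision=0.9974, recall=1.0)
print(f"{reported_f:.2%}")  # prints 99.87%
```

Under that assumption, the computed value agrees with the 99.87% stated in the abstract. The confusion-matrix counts themselves are not given in the abstract, so `metrics_from_counts` is shown only to make the metric definitions explicit.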

Keywords: Artificial Intelligence, Machine Learning, Deep Learning, Hybrid clustering approach, Classification, CNN, LSTM, Boosted Decision Tree, Social platforms, K-means

Copyright

© 2024 Alharthi & Alzahrani. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)
