P-ISSN 0974-6846 E-ISSN 0974-5645

Indian Journal of Science and Technology



Year: 2022, Volume: 15, Issue: 11, Pages: 481-488

Original Article

URL Redirection Attack Mitigation in Social Communication Platform using Data Imbalance Aware Machine Learning Algorithm

Received Date: 29 September 2021, Accepted Date: 10 February 2022, Published Date: 22 March 2022


Objectives: To present a model that detects malicious URL redirection attacks on a social communication platform using a data-imbalance-aware machine learning algorithm, with the aim of detecting such attacks and preventing them from succeeding.

Methods: This study presents an efficient feature extraction and selection method that addresses the feature imbalance problem, together with an improved concept-drift-aware, machine-learning-based classification. The method extracts the URLs of undesired tweets, identifies them, and filters them for classification.

Findings: The experiments were conducted on the drifted Twitter spam dataset. Compared with existing ML techniques (Random Forest, K-Nearest Neighbour, XGBoost), the proposed DIA-XGBoost model improves accuracy by 1.254%, URL recall by 0.14%, and F-measure by 10%.

Novelty: Various Machine Learning (ML) techniques have been applied to the classification of URL redirection attacks. However, spam data generally exhibit feature imbalance, and the attack pattern varies over time, so existing ML-based classification models achieve poor classification accuracy. The proposed DIA-XGBoost algorithm addresses these issues in order to detect and prevent malicious URL attacks.
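The abstract does not give the details of the feature extraction or the imbalance handling, so the following is only a minimal sketch of the general idea: pull URLs out of tweet text, derive a few lexical features, and measure how skewed the labels are. The regular expression, the feature names, the toy tweets, and the helper functions are all hypothetical and are not taken from the paper.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Hypothetical pattern: matches http(s) links up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def extract_urls(tweet_text):
    """Return every http(s) URL in a tweet, with trailing punctuation removed."""
    return [u.rstrip(".,;:!?)") for u in URL_PATTERN.findall(tweet_text)]


def url_features(url):
    """Compute a few illustrative lexical features (not the paper's feature set)."""
    parsed = urlparse(url)
    host = parsed.netloc
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots_in_host": host.count("."),
        "num_digits": sum(ch.isdigit() for ch in url),
        "path_depth": len([p for p in parsed.path.split("/") if p]),
    }


def imbalance_ratio(labels):
    """Ratio of negative (0) to positive (1) labels; large values mean rare positives."""
    counts = Counter(labels)
    if counts[1] == 0:
        raise ValueError("no positive examples")
    return counts[0] / counts[1]


# Toy data invented for illustration: label 1 = spam, 0 = non-spam.
tweets = [
    ("Win a free phone now http://bit.ly/abc123", 1),
    ("Lunch was great today", 0),
    ("Reading https://example.com/news/2022/story.", 0),
    ("Project update posted at https://example.org/blog", 0),
]

# Keep only tweets that contain a URL, one feature row per extracted URL.
rows = []
for text, label in tweets:
    for url in extract_urls(text):
        rows.append((url_features(url), label))

ratio = imbalance_ratio([label for _, label in tweets])
```

A ratio of this kind is one common way to weight the rare class in gradient-boosted classifiers (for example through XGBoost's `scale_pos_weight` parameter). Whether DIA-XGBoost uses such a weighting is not stated in the abstract.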

Keywords: Data Imbalance; Feature Extraction; Concept Drift; URL; Machine Learning; URL Redirection Attack




© 2022 Patil & Dinesha. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Published By Indian Society for Education and Environment (iSee)

