P-ISSN 0974-6846 | E-ISSN 0974-5645

Indian Journal of Science and Technology


Year: 2024, Volume: 17, Issue: 21, Pages: 2218-2231

Original Article

Mitigating Gradient-Based Data Poisoning Attacks on Machine Learning Models: A Statistical Detection Method

Received: 02 April 2024, Accepted: 12 May 2024, Published: 29 May 2024


Objectives: This research paper aims to develop a novel method for identifying gradient-based data poisoning attacks on industrial applications, such as autonomous vehicles and intelligent healthcare systems, that rely on machine learning and deep learning techniques. These algorithms perform well only when trained on good-quality datasets. However, ML models are prone to data poisoning attacks, which target the training dataset and manipulate its input samples so that the machine learning algorithm is misled and produces wrong predictions. Current detection techniques are effective at detecting known attacks but lack generalized detection of unknown attacks. To address this issue, this paper integrates security elements within the machine learning framework, guaranteeing effective identification and mitigation of both known and unknown threats and achieving generalized detection. Methods: ML-Filter, a unique attack detection approach, integrates the ML-Filter Detection Algorithm and the Statistical Perturbation Bounds Identification Algorithm to determine whether a given dataset is poisoned. The DBSCAN algorithm is used to divide the dataset into several smaller subsets, on which the algorithmic detection analysis is performed. The performance of the proposed method is evaluated in terms of true positive rate and significance test accuracy. Findings: The probability distribution differences between original and poisoned datasets vary with the perturbation size rather than with the datasets or ML models used in the application. This finding led to determining the perturbation bounds using statistical pairwise distance metrics and significance tests computed on the results. ML-Filter demonstrates a high detection rate of 99.63% for known attacks and achieves a generalized detection accuracy of 98% for unknown attacks.
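The detection pipeline described above can be sketched in simplified form: partition the data with DBSCAN, then run a pairwise statistical significance test between the original and suspect distributions within each subset. This is an illustrative sketch only, not the authors' ML-Filter implementation; the synthetic data, the Kolmogorov–Smirnov test as the distance metric, and all parameter values (`eps`, `min_samples`, the 0.05 significance level, the 0.3 perturbation) are assumptions for demonstration.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical stand-ins for a clean training set and a perturbed copy.
clean = rng.normal(0.0, 1.0, size=(500, 2))
poisoned = clean + 0.3  # a fixed perturbation of size 0.3 on every feature

# Step 1: partition the data into smaller subsets with DBSCAN.
labels = DBSCAN(eps=0.5, min_samples=5).fit_predict(clean)

# Step 2: within each subset, compare the clean and suspect feature
# distributions with a two-sample Kolmogorov-Smirnov test.
suspect_clusters = []
for lbl in set(labels) - {-1}:          # -1 marks DBSCAN noise points
    mask = labels == lbl
    stat, p = ks_2samp(clean[mask, 0], poisoned[mask, 0])
    if p < 0.05:                        # significance test flags a shifted subset
        suspect_clusters.append(lbl)

print("clusters flagged as possibly poisoned:", suspect_clusters)
```

Any distributional pairwise distance (e.g. Wasserstein or energy distance) could replace the KS test in step 2; the key idea from the paper is that the significance threshold tracks perturbation size rather than the specific dataset or model.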
Novelty: A secured ML architecture and a unique statistical detection approach, ML-Filter, effectively detect data poisoning attacks, demonstrating significant advances in identifying both known and unknown threats in industrial applications that use machine learning and deep learning algorithms.

Keywords: Privacy and security, Adversarial machine learning, Secured ML Architecture, ML-Filter, Statistical Perturbation Bounds Identification Algorithm


© 2024 Sanapala & Gondi. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)