P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2021, Volume: 14, Issue: 32, Pages: 2607-2615

Original Article

Optimized Stacking Ensemble (OSE) for Credit Card Fraud Detection using Synthetic Minority Oversampling Model

Received Date: 10 May 2021, Accepted Date: 03 September 2021, Published Date: 01 October 2021

Abstract

Objectives: Credit card fraud is a global threat to financial institutions, and detecting it is made harder by challenges such as imbalanced datasets and hidden patterns in real-life scenarios. The objective of this study is to propose a model that effectively identifies fraudulent transactions.
Methods: The Synthetic Minority Oversampling Technique (SMOTE) and Generative Adversarial Networks (GAN), both of which generate synthetic data, are used to even out the distribution of data between the two classes in the original dataset. After balancing, the individual models Multi-Layer Perceptron (MLP), k-Nearest Neighbors (kNN) and Support Vector Machine (SVM) are trained on the augmented dataset to establish an initial improvement at the data level. These base classifiers are then incorporated into the Optimized Stacked Ensemble (OSE) learning process to fit the meta-classifier, which yields an effective predictive model for fraud detection. All base classifiers and the final OSE were implemented so that their performance could be critically assessed.
Findings: Empirical results show that the quality of the final dataset improves considerably when SMOTE and GAN are used as oversampling algorithms. The MLP model showed an increase of 10% in F1 score, while kNN and SVM showed an increase of 3% each. The optimized model is built using a stacking classifier that combines the GAN-improved MLP with the other standard classification models, kNN and SVM. This ensemble outperforms the existing enhanced MLP, with near-perfect accuracy (99.86%) and an increase of 16% in F1 score, resulting in an effective fraud detection mechanism.
Novelty: For the current dataset, the Optimized Stacked Ensemble model shows an increase of 16% in F1 score compared to the existing Multi-Layer Perceptron model.
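
The oversampling step named in the abstract can be illustrated with a minimal sketch of the core SMOTE idea: a synthetic minority sample is placed at a random point on the line segment between a minority sample and one of its nearest minority neighbours. This is a simplified, standard-library illustration and not the authors' implementation; the function name `smote_oversample`, its parameters, and the toy data are all assumptions made for the example.

```python
import math
import random


def smote_oversample(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points by interpolating between minority neighbours.

    Simplified sketch: the base sample is drawn at random each time, rather
    than looping over every minority sample a fixed number of times.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_synthetic):
        # Pick a random minority sample as the base point.
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to the base.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Hypothetical toy data: 3 "fraud" rows versus 12 "legitimate" rows.
fraud = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.3)]
legit = [(5.0 + 0.1 * i, 5.0 - 0.1 * i) for i in range(12)]

# Generate just enough synthetic fraud rows to match the majority class size.
new_fraud = smote_oversample(fraud, n_synthetic=len(legit) - len(fraud), k=2)
balanced_fraud = fraud + new_fraud
print(len(balanced_fraud), len(legit))
```

In practice, a maintained library implementation would normally be used instead of a hand-written one, and oversampling would be applied to the training split only, so that synthetic rows do not leak into the evaluation data.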

Keywords: Ensemble; Credit Card; Fraud Detection; GAN; SMOTE; MLP

Copyright

© 2021 Veigas et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)
