P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2022, Volume: 15, Issue: 7, Pages: 300-308

Original Article

A Credit Scoring Heterogeneous Ensemble Model Using Stacking and Voting

Received Date: 15 September 2021, Accepted Date: 28 January 2022, Published Date: 23 February 2022

Abstract

Background/Objectives: Recent studies have emphasized the use of ensemble models over single classifiers for credit scoring problems. The objective of this study is to build a heterogeneous ensemble classifier with improved classification accuracy.

Methods: This study develops a heterogeneous ensemble classifier that uses Logistic Regression (LR), K-nearest neighbor (KNN), Decision Tree (DT), Random Forest (RF), Naïve Bayes (NB) and Support Vector Machine (SVM) as base classifiers, and RF, LR and SVM as meta-classifiers. A feature selection algorithm based on the random forest technique selects the best features. The outputs of the six base classifiers are aggregated by stacking, followed by voting.

Findings: The ensemble classifier outperforms the single classifiers SVM, DT, RF, NB, KNN and LR, with an accuracy of 91.56% on the Australian dataset and 84.35% on the German dataset.

Novelty: The proposed model combines stacking and majority voting for ensemble classification. Stacking is applied to the base classifiers first, in two levels. In the first level, the training dataset is split into 10 folds for cross-validation, the output of each base classifier is collected, and the dataset is updated with these meta-features. In the second level, three meta-classifiers (MC), namely LR, SVM and RF, are used. Majority voting is then applied to the outputs of these meta-classifiers to produce the final prediction.
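The two-level procedure described under Novelty can be illustrated with a minimal, standard-library Python sketch. It is not the authors' implementation. The toy learners (`make_stump`, `fit_centroid`, `fit_1nn`), the helper names and the tiny dataset are hypothetical stand-ins for the six base classifiers and three meta-classifiers named in the abstract, and the random-forest feature selection step is not shown. The sketch also assumes that the meta-classifiers see only the meta-features; the abstract does not say whether the original features are kept alongside them.

```python
from collections import Counter
from statistics import mean

def k_fold_indices(n, k):
    """Yield (train_idx, val_idx) for k interleaved folds."""
    for fold in range(k):
        val = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, val

def majority_vote(labels):
    """Return the most common label among the given predictions."""
    return Counter(labels).most_common(1)[0][0]

# --- Toy learners (hypothetical stand-ins); each fit returns a predict function ---
def make_stump(j):
    """Threshold classifier on feature j, for binary labels 0/1."""
    def fit(X, y):
        m0 = mean(x[j] for x, t in zip(X, y) if t == 0)
        m1 = mean(x[j] for x, t in zip(X, y) if t == 1)
        thr = (m0 + m1) / 2
        return lambda x: int((x[j] > thr) == (m1 > m0))
    return fit

def fit_centroid(X, y):
    """Nearest-centroid classifier."""
    cents = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [mean(col) for col in zip(*rows)]
    def predict(x):
        return min(cents, key=lambda c: sum((a - b) ** 2 for a, b in zip(x, cents[c])))
    return predict

def fit_1nn(X, y):
    """1-nearest-neighbour classifier."""
    def predict(x):
        dists = [sum((a - b) ** 2 for a, b in zip(x, row)) for row in X]
        return y[dists.index(min(dists))]
    return predict

def stack_and_vote(X, y, X_new, base_fits, meta_fits, k=10):
    n = len(X)
    # Level 1: out-of-fold predictions of each base learner become meta-features.
    meta_X = [[None] * len(base_fits) for _ in range(n)]
    for train, val in k_fold_indices(n, k):
        X_tr = [X[i] for i in train]
        y_tr = [y[i] for i in train]
        for j, fit in enumerate(base_fits):
            predict = fit(X_tr, y_tr)
            for i in val:
                meta_X[i][j] = predict(X[i])
    # Refit base learners on all training data to build meta-features for new samples.
    base_models = [fit(X, y) for fit in base_fits]
    meta_new = [[m(x) for m in base_models] for x in X_new]
    # Level 2: train the meta-classifiers, then take a majority vote of their outputs.
    meta_models = [fit(meta_X, y) for fit in meta_fits]
    return [majority_vote([m(row) for m in meta_models]) for row in meta_new]

# Made-up, linearly separable toy data: 10 samples per class, 2 features.
X = [(1.0 + 0.1 * i, 1.0 + 0.05 * i) for i in range(10)] + \
    [(4.0 + 0.1 * i, 4.0 - 0.05 * i) for i in range(10)]
y = [0] * 10 + [1] * 10

base_fits = [make_stump(0), make_stump(1), fit_centroid]  # stand-ins for base classifiers
meta_fits = [fit_centroid, fit_1nn, make_stump(0)]        # stand-ins for LR, SVM, RF

preds = stack_and_vote(X, y, [(1.2, 1.1), (4.5, 3.9)], base_fits, meta_fits, k=10)
print(preds)
```

Building the meta-features from out-of-fold predictions means each meta-classifier is trained on base-classifier outputs for samples those base classifiers did not see, which limits leakage from level one into level two. Using an odd number of meta-classifiers, as the abstract does, avoids ties in a binary majority vote.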

Keywords: Credit scoring; ensemble model; SVM; DT; RF; NB; KNN; LR

Copyright

© 2022 Anil Kumar et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Published By Indian Society for Education and Environment (iSee)
