Indian Journal of Science and Technology
Year: 2022, Volume: 15, Issue: 7, Pages: 300-308
Original Article
C J Anil Kumar1,2*, B K Raghavendra3, S Raghavendra4
1Research Scholar, Department of Computer Science and Engineering Research Centre, BGS Institute of Technology, B G Nagara, Mandya, Karnataka, India
2Associate Professor, Department of Computer Science and Engineering, ATME College of Engineering, Mysuru, Karnataka, India
3Professor & Head, Department of Computer Science and Engineering, BGS Institute of Technology, B G Nagara, Mandya, Karnataka, India
4Associate Professor, Department of Computer Science and Engineering, School of Engineering and Technology, CHRIST Deemed to be University, Bengaluru, Karnataka, India
*Corresponding Author
Email: [email protected]
Received Date: 15 September 2021, Accepted Date: 28 January 2022, Published Date: 23 February 2022
Background/Objectives: Recent studies have emphasized the use of ensemble models over single classifiers for credit scoring problems. The objective of this study is to build a heterogeneous ensemble classifier with improved classification accuracy.
Methods: This study develops a heterogeneous ensemble classifier using Logistic Regression (LR), K-Nearest Neighbor (KNN), Decision Tree (DT), Random Forest (RF), Naïve Bayes (NB) and Support Vector Machine (SVM) as base classifiers, and RF, LR and SVM as meta-classifiers. The outputs of these six base classifiers are aggregated by the ensemble. A feature selection algorithm based on the random forest technique is used to select the best features. Stacking and voting methods are used to build the ensemble model.
Findings: The ensemble classifier gives better predictive performance than the single classifiers SVM, DT, RF, NB, KNN and LR, with an accuracy of 91.56% on the Australian dataset and 84.35% on the German dataset.
Novelty: The proposed model uses stacking and majority voting for ensemble classification. Stacking is first applied to the base classifiers, in two levels. In the first level, the training dataset is split into 10 folds for cross-validation; the output of each base classifier is taken, and the dataset is updated with these meta-features. In the second level, three meta-classifiers (MC), namely LR, SVM and RF, are used. Majority voting is applied to the outputs of these meta-classifiers to produce the final prediction.
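The two-level procedure described under Novelty can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' implementation. All function names (`kfold_indices`, `out_of_fold_meta_features`, `majority_vote`, `stack_and_vote`, `fit_threshold`) are hypothetical, and the toy single-feature threshold learner stands in for the six base classifiers and three meta-classifiers named in the abstract. The sketch further assumes contiguous, unshuffled folds, binary labels in {0, 1}, meta-features that replace rather than extend the original features, and base models that are refit on the full training set before prediction.

```python
from collections import Counter
from typing import Callable, List, Sequence

Row = Sequence[float]
Predictor = Callable[[Row], int]                     # maps one row to a 0/1 label
Fitter = Callable[[List[Row], List[int]], Predictor]  # trains and returns a Predictor


def kfold_indices(n: int, k: int) -> List[List[int]]:
    """Split indices 0..n-1 into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def out_of_fold_meta_features(X, y, base_fitters, k=10) -> List[List[int]]:
    """Level 1: each row's meta-features come from base models that never saw it."""
    n = len(X)
    meta = [[0] * len(base_fitters) for _ in range(n)]
    for fold in kfold_indices(n, k):
        held_out = set(fold)
        train_X = [X[i] for i in range(n) if i not in held_out]
        train_y = [y[i] for i in range(n) if i not in held_out]
        for j, fit in enumerate(base_fitters):
            predict = fit(train_X, train_y)
            for i in fold:
                meta[i][j] = predict(X[i])
    return meta


def majority_vote(votes: Sequence[int]) -> int:
    """Return the most frequent label (no ties with an odd number of binary voters)."""
    return Counter(votes).most_common(1)[0][0]


def stack_and_vote(X, y, base_fitters, meta_fitters, k=10) -> Predictor:
    """Level 2: train meta-models on the meta-features, then vote over their outputs."""
    meta_X = out_of_fold_meta_features(X, y, base_fitters, k)
    base_models = [fit(X, y) for fit in base_fitters]  # assumption: refit on all data
    meta_models = [fit(meta_X, y) for fit in meta_fitters]

    def predict(row: Row) -> int:
        meta_row = [model(row) for model in base_models]
        return majority_vote([model(meta_row) for model in meta_models])

    return predict


def fit_threshold(feature: int) -> Fitter:
    """Toy stand-in learner: cut one feature midway between the two class means."""
    def fit(X, y):
        pos = [r[feature] for r, t in zip(X, y) if t == 1]
        neg = [r[feature] for r, t in zip(X, y) if t == 0]
        mean_pos, mean_neg = sum(pos) / len(pos), sum(neg) / len(neg)
        cut = (mean_pos + mean_neg) / 2
        sign = 1 if mean_pos >= mean_neg else -1
        return lambda row: int(sign * (row[feature] - cut) > 0)
    return fit


# Synthetic, perfectly separable toy data (not the Australian or German dataset).
X = [[float(i % 2), 3.0 * (i % 2) + 1.0, 5.0 - 2.0 * (i % 2)] for i in range(20)]
y = [i % 2 for i in range(20)]
learners = [fit_threshold(0), fit_threshold(1), fit_threshold(2)]
model = stack_and_vote(X, y, base_fitters=learners, meta_fitters=learners, k=10)
```

In practice each `Fitter` would wrap a real classifier. Using an odd number of meta-classifiers, as in the abstract, keeps the binary majority vote free of ties.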
Keywords: Credit scoring; ensemble model; SVM; DT; RF; NB; KNN; LR
© 2022 Anil Kumar et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Published By Indian Society for Education and Environment (iSee)