Indian Journal of Science and Technology
Year: 2024, Volume: 17, Issue: 4, Pages: 343-351
Original Article
Digambar B Uphade1*, Aniket A Muley2, Swapnil V Chalwadi3
1Department of Statistics, KRT Arts, BH Commerce and AM Science College, Nashik, 422002, Maharashtra, India
2School of Mathematical Sciences, Swami Ramanand Teerth Marathwada University, Nanded, 431606, Maharashtra, India
3School of Liberal Arts, Dr. Vishwanath Karad MIT World Peace University, Pune, 411038, Maharashtra, India
*Corresponding Author
Email: [email protected]
Received Date: 24 November 2023, Accepted Date: 28 December 2023, Published Date: 20 January 2024
Objectives: In the current financial landscape, banks face significant challenges in managing credit risk and ensuring the stability of their loan portfolios. Accurate assessment of the probability of loan default is therefore a critical part of a bank's overall risk management process. This study aims to develop a predictive model that accurately identifies potential defaulters.

Methods: The investigation employs a range of machine learning techniques, namely Random Forest, Logistic Regression, Decision Tree, k-Nearest Neighbour, Support Vector Machine (SVM), XGBoost, AdaBoost, and Gradient Boosting Machines, to evaluate loan default probabilities in both balanced and imbalanced data environments. The algorithms were applied to datasets characterized by class imbalance, a frequent occurrence in financial risk assessment. This imbalance was addressed with resampling techniques to improve the representativeness and accuracy of the findings.

Findings: On the imbalanced datasets, the Random Forest algorithm achieved the highest accuracy score, 0.91. Logistic Regression and SVM performed comparably, with accuracy scores of 0.90 and 0.91 respectively. On the balanced datasets, the Random Forest model reached a perfect accuracy score of 1.00, surpassing the other models. It also consistently excelled in precision, recall, and F1-score across the different data scenarios.

Novelty: This study highlights the Random Forest classifier as an optimal tool for predicting loan defaults, marking a significant advancement over existing methodologies. The results give financial institutions insights for improving loan risk assessment and support more precise and informed lending decisions.
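The role of resampling and of the reported metrics can be illustrated with a minimal sketch. The abstract does not state which resampling technique was used, so random oversampling of the minority class is assumed here purely for illustration. The toy labels (90 non-defaulters, 10 defaulters) and the helper names `random_oversample` and `classification_metrics` are hypothetical and do not come from the study.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until all
    classes have as many rows as the largest class (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def classification_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Hypothetical toy data: 90 non-defaulters (0) and 10 defaulters (1).
labels = [0] * 90 + [1] * 10
rows = [[i] for i in range(len(labels))]

# A trivial "model" that never predicts default: accuracy is 0.9,
# yet recall for defaulters is 0.0.
baseline = classification_metrics(labels, [0] * len(labels))

# After oversampling, both classes have 90 rows, and the same trivial
# model scores only 0.5 accuracy.
balanced_rows, balanced_labels = random_oversample(rows, labels)
balanced_counts = Counter(balanced_labels)
balanced_baseline = classification_metrics(
    balanced_labels, [0] * len(balanced_labels)
)

print(baseline)
print(dict(balanced_counts))
print(balanced_baseline)
```

The sketch shows why accuracy alone can mislead on imbalanced data and why precision, recall, and F1-score are reported alongside it. In practice, resampling is usually applied to the training split only, so that duplicated rows do not also appear in the evaluation set.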
Keywords: Credit risk, Machine learning, Random forest, Loan defaulter, Classification
© 2024 Uphade et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)