Indian Journal of Science and Technology

Year: 2022, Volume: 15, Issue: 6, Pages: 237-242

Original Article

Impact of Unbalanced Classification on the Performance of Software Defect Prediction Models

Received Date: 25 November 2021, Accepted Date: 23 January 2022, Published Date: 16 February 2022


Objectives: To propose a suitable imbalanced data classification model that splits a data set into two new data sets, and to test the prediction models on the resulting imbalanced data sets. Methods: The imbalanced defect data sets are taken from the PROMISE repository and used for performance evaluation. The results show that three existing classifiers, K-Nearest Neighbor (KNN), Naive Bayes (NB), and Back Propagation Network (BPN), are highly sensitive to class imbalance, while Support Vector Machine (SVM) and Extreme Learning Machine (ELM) are more stable. Findings: The SVM and ELM models improve defect prediction performance, scoring 29% higher than KNN and 19% higher than NB and BPN. Novelty: The proposed classification imbalance impact analysis method transforms the original imbalanced data set into new data sets with an increasing imbalance rate, so that prediction models can be selected and evaluated on the new data sets; according to the findings of a comprehensive study, it outperforms existing methods.

Keywords: Software Fault Prediction Model; Imbalance Problem Classification; Artificial Intelligence; Smart Debugging; Unbalanced Classification
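
The idea of deriving data sets with an increasing imbalance rate from one original data set can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes binary labels (1 = defective, 0 = non-defective), assumes the imbalance rate is the ratio of non-defective to defective modules, and assumes the rate is raised by randomly downsampling the defective class. The function names `imbalance_ratio` and `make_imbalanced` and the toy data are hypothetical.

```python
import random


def imbalance_ratio(labels):
    """Return (# non-defective) / (# defective) for binary labels 0/1."""
    defective = sum(1 for y in labels if y == 1)
    if defective == 0:
        raise ValueError("no defective samples")
    return (len(labels) - defective) / defective


def make_imbalanced(samples, target_ratio, seed=0):
    """Build a new data set with roughly `target_ratio` imbalance.

    Assumption: all non-defective samples are kept and the defective
    class is randomly downsampled. `samples` is a list of
    (features, label) pairs.
    """
    rng = random.Random(seed)
    clean = [s for s in samples if s[1] == 0]
    defective = [s for s in samples if s[1] == 1]
    keep = max(1, int(len(clean) / target_ratio))
    if keep > len(defective):
        raise ValueError("target ratio is lower than the original imbalance")
    subset = clean + rng.sample(defective, keep)
    rng.shuffle(subset)
    return subset


# Toy data (hypothetical): 80 non-defective and 20 defective modules.
toy = [((i,), 0) for i in range(80)] + [((i,), 1) for i in range(20)]

# One derived data set per target imbalance rate, in increasing order.
series = {r: make_imbalanced(toy, r) for r in (4, 8, 16)}
```

Each data set in `series` could then be passed to the classifiers under comparison (for example KNN, NB, BPN, SVM, and ELM) to observe how their scores change as the imbalance rate grows.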


