Indian Journal of Science and Technology
DOI: 10.17485/IJST/v16i34.1376
Year: 2023, Volume: 16, Issue: 34, Pages: 2789-2795
Original Article
Ramesh Ponnala1,2*, C R K Reddy3
1Research Scholar, UCE, Osmania University, Hyderabad, Telangana, India
2Asst. Professor, Department of MCA, Chaitanya Bharathi Institute of Technology (A), Gandipet, Hyderabad, 500075, Telangana, India
3Professor and Head, Department of CSE, Mahatma Gandhi Institute of Technology (A), Gandipet, Hyderabad, 500075, Telangana, India
*Corresponding Author
Email: [email protected]
Received Date: 05 June 2023, Accepted Date: 14 August 2023, Published Date: 15 September 2023
Objectives: To present a RESTful service-based software defect prediction approach that employs Machine Learning (ML) algorithms to identify software defects.

Methods: The proposed approach is designed as a flexible solution for predicting software defects with a range of machine-learning techniques. It takes class-level software metrics from RESTful web service-based software, including code complexity, size, coupling, and cohesion metrics, and uses them to train several ML models: Logistic Regression, Random Forest Classifier, LightGBM, XGBoost, and Support Vector Machines.

Findings: We propose a correlation coefficient method for feature selection, which reduces the feature set from 98 to 25 features. Using class-level metrics from the dataset of the RESTful service-based Elastic Search Engine, we achieved the highest F-measure of 0.677 with the LightGBM model. The existing work used 10-fold cross-validation and achieved an F-measure of 0.5817 with the Decision Table model.

Novelty: Most existing work relies on the publicly available NASA PROMISE datasets, which were generated long ago from software written in legacy programming languages and have not been updated since. This can lead to data source bias, meaning that the findings and models may not be representative of software systems from other domains or industries. The proposed work instead uses a newly generated, publicly available dataset of defects in RESTful software: the Bug Hunter Dataset. The Bug Hunter Dataset aims to cover a wide range of projects and software systems from different domains and industries. This diversity allows researchers to develop defect prediction models that are more generalizable and applicable to real-world scenarios and to specific organizations or domains.
As of now, no one apart from the original author has used this dataset for software defect prediction. In the proposed work we used one of the Bug Hunter Datasets, Elastic Search Engine, which is RESTful service-based software. We applied several feature selection methods and obtained the best results with the Correlation Coefficient technique: an F-measure of 0.677 using LightGBM with a hold-out validation approach. The existing work, by contrast, used 10-fold cross-validation and achieved 0.5817 as its highest F-measure with the Decision Table machine learning model. Future work can extend the comparison to other Machine Learning models.
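As a rough illustration of the two ideas summarized above (correlation-based feature selection and the F-measure), the following minimal Python sketch may help. It is not the authors' implementation: it assumes that features are ranked by the absolute Pearson correlation with the binary defect label and that the top k are kept, which the abstract does not specify. The function names, the metric names `loc`, `cbo`, and `const`, and the toy values are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_top_k(columns, labels, k):
    """Keep the k feature names with the largest |correlation| to the label.

    Assumption: ranking against the defect label; the paper may instead
    filter on feature-to-feature correlation.
    """
    scores = {name: abs(pearson(vals, labels)) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def f_measure(y_true, y_pred):
    """Harmonic mean of precision and recall for the defective class (1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: three class-level metrics for six classes.
columns = {
    "loc": [10, 200, 15, 300, 20, 250],   # size metric, tracks the label
    "cbo": [1, 2, 3, 1, 2, 3],            # coupling metric, unrelated here
    "const": [5, 5, 5, 5, 5, 5],          # constant column
}
labels = [0, 1, 0, 1, 0, 1]               # 1 = defective class

selected = select_top_k(columns, labels, k=1)
score = f_measure([1, 1, 0, 0], [1, 0, 1, 0])
```

In the setting described above, the same ranking would be applied to the 98 class-level metrics with k = 25, and the F-measure would be computed on the held-out portion of the data rather than on a toy vector.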
Keywords: Software Defect Prediction; Feature Reduction; Correlation Coefficient; Machine Learning; RESTful Service Software; LightGBM; Random Forest; SVM
© 2023 Ponnala & Reddy. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)