P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology


Year: 2023, Volume: 16, Issue: 34, Pages: 2789-2795

Original Article

RESTful Service based Software Defect Prediction using ML Algorithms

Received Date: 05 June 2023, Accepted Date: 14 August 2023, Published Date: 15 September 2023


Objectives: To present a RESTful service-based software defect prediction approach that employs Machine Learning (ML) algorithms to identify software defects.

Methods: The proposed approach is designed to provide a flexible solution for predicting software defects using various machine-learning techniques. It uses class-level software metrics of RESTful web service-based software, including code complexity, size, coupling, and cohesion metrics, to train several ML models: Logistic Regression, Random Forest Classifier, LightGBM, XGBoost, and Support Vector Machines.

Findings: We propose a correlation coefficient method for feature selection that reduces the feature set from 98 features to 25. Using class-level metrics from the dataset of the RESTful service-based Elastic Search Engine, we achieved the highest F-measure of 0.677 with the LightGBM model. The existing work used 10-fold cross-validation and achieved an F-measure of 0.5817 with the Decision Table model.

Novelty: Most existing studies use the publicly available NASA PROMISE datasets, which were generated long ago from software written in legacy programming languages and have not been updated since. This can lead to data source bias: the findings and models developed may not be representative of software systems from other domains or industries. The proposed work uses a newly generated, publicly available dataset of RESTful software defects, the Bug Hunter Dataset. The Bug Hunter Dataset aims to cover a wide range of projects and software systems from different domains and industries. This diversity allows researchers to develop defect prediction models that generalize better and apply to real-world scenarios and to specific organizations or domains.
To date, no one apart from the original authors has used this dataset for software defect prediction. In the proposed work we used one of the Bug Hunter Datasets, Elastic Search Engine, which is RESTful service-based software. We applied different feature selection methods and obtained the best results with the Correlation Coefficient technique: an F-measure of 0.677 using LightGBM with a hold-out validation approach. The existing work, by contrast, used 10-fold cross-validation and reported 0.5817 as its highest F-measure, using the Decision Table machine learning model. Future work can extend the comparison to other Machine Learning models.
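The two quantities the abstract relies on, a correlation-based feature ranking and the F-measure, can be illustrated with a minimal sketch. The abstract does not say exactly how the correlation coefficient is applied, so this sketch assumes one common variant: rank each feature by the absolute Pearson correlation with the binary defect label and keep the top k. The metric names (`loc`, `cbo`, `noise`, `constant`), the toy values, and the function names are hypothetical, and the code is plain Python rather than the authors' tooling.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def select_top_k(columns, labels, k):
    """Return the k feature names with the largest |correlation| to the label.

    Assumption: features are ranked against the defect label; the paper may
    instead (or additionally) filter on feature-to-feature correlation.
    """
    scores = {name: abs(pearson(values, labels)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


def f_measure(y_true, y_pred):
    """F-measure (F1) for the positive class, where 1 means 'defective'."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: six classes, four class-level metrics, one defect label.
labels = [0, 0, 1, 1, 0, 1]
columns = {
    "loc": [10, 20, 80, 90, 15, 70],
    "cbo": [1, 2, 3, 5, 2, 4],
    "noise": [5, 1, 4, 2, 3, 3],
    "constant": [7, 7, 7, 7, 7, 7],
}

selected = select_top_k(columns, labels, k=2)
score = f_measure([1, 1, 0, 0, 1], [1, 0, 0, 1, 1])
print(selected, round(score, 3))
```

In the setting described above, `k` would be 25 out of 98 metrics, and `f_measure` would be computed on the held-out split after training a classifier on the selected columns.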

Keywords: Software Defect Prediction; Feature Reduction; Correlation Coefficient; Machine Learning; RESTful Service Software; LightGBM; Random Forest; SVM


  1. Ferenc R, Gyimesi P, Gyimesi G, Tóth Z, Gyimóthy T. An automatically created novel bug dataset and its validation in bug prediction. Journal of Systems and Software. 2020;169:1–20. Available from: https://doi.org/10.1016/j.jss.2020.110691
  2. Matloob F, Ghazal TM, Taleb N, Aftab S, Ahmad M, Khan MA, et al. Software Defect Prediction Using Ensemble Learning: A Systematic Literature Review. IEEE Access. 2021;9:98754–98771. Available from: https://doi.org/10.1109/ACCESS.2021.3095559
  3. Yang Z, Jin C, Zhang Y, Wang J, Yuan B, Li H. Software Defect Prediction: An Ensemble Learning Approach. In: International Conference on Computer, Big Data and Artificial Intelligence (ICCBDAI 2021), Journal of Physics: Conference Series. Beihai, China, 12/11/2021 - 14/11/2021. IOP Publishing. 2171:1–7.
  4. Rathore SS, Kumar S. An Approach for the Prediction of Number of Software Faults Based on the Dynamic Selection of Learning Techniques. IEEE Transactions on Reliability. 2019;68(1):216–236. Available from: https://doi.org/10.1109/TR.2018.2864206
  5. Tóth Z, Gyimesi P, Ferenc R. A Public Bug Database of GitHub Projects and Its Application in Bug Prediction. In: ICCSA 2016: Computational Science and Its Applications, Lecture Notes in Computer Science book series. Springer International Publishing. 9789:625–638.
  6. Shamrat FJM, Azam S, Karim A, Ahmed K, Bui FM, Boer FD. High-precision multiclass classification of lung disease through customized MobileNetV2 from chest X-ray images. Computers in Biology and Medicine. 2023;155:1–14. Available from: https://doi.org/10.1016/j.compbiomed.2023.106646


© 2023 Ponnala & Reddy. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)

