• P-ISSN 0974-6846 E-ISSN 0974-5645

Indian Journal of Science and Technology



Year: 2022, Volume: 15, Issue: 39, Pages: 1978-1986

Original Article

An Ensemble Learning Approach for Effective Prediction of Diabetes Mellitus Using Hard Voting Classifier

Received Date:23 July 2022, Accepted Date:16 September 2022, Published Date:15 October 2022


Objectives: Diabetes is a chronic condition characterized by high blood glucose levels. It affects people all over the world and is one of the leading causes of death worldwide. Machine learning techniques can aid its early identification and prediction. The purpose of this study is to use an ensemble of machine learning algorithms to predict diabetes efficiently and thereby help patients suffering from this disease.

Methods: Existing methods use a single model to predict diabetes, which may limit accuracy because no one model fits all datasets. We therefore propose a robust model based on ensemble learning with a hard voting classifier. The model was tested on both the Pima Indians Diabetes dataset and the Early Stage Diabetes Risk Prediction Dataset, which contain data on people with and without diabetes. For classification, the proposed hard voting ensemble combines three machine learning algorithms: logistic regression, decision tree, and support vector machine.

Findings: On the Pima Indians Diabetes dataset, the proposed ensemble approach achieves the highest accuracy, precision, recall, and F1 score, each 81.17%. On the Early Stage Diabetes Risk Prediction Dataset, it achieves the highest accuracy, precision, recall, and F1 score, each 94.23%.

Novelty: The proposed methodology was evaluated experimentally against state-of-the-art techniques and basic classifiers such as K-Nearest Neighbor, Logistic Regression, Support Vector Machine, and Random Forest. The results are validated by computing the confusion matrix and ROC curve for each classifier type.

Keywords: Diabetes Detection; Machine Learning; Supervised Classification; Ensemble Classification; Hard Voting Classifier
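
The hard voting step described in the Methods can be illustrated with a minimal sketch. It assumes that each of the three base classifiers (logistic regression, decision tree, support vector machine) has already produced a class label per patient; the prediction lists, the labels, and the helper names `hard_vote` and `accuracy` are hypothetical and chosen only for illustration. This is not the authors' implementation and the numbers are not taken from the study.

```python
from collections import Counter


def hard_vote(predictions):
    """Return the majority label for each sample.

    predictions: list of per-classifier label lists, all the same length.
    With an odd number of binary voters, ties cannot occur.
    """
    n_samples = len(predictions[0])
    if any(len(p) != n_samples for p in predictions):
        raise ValueError("all classifiers must predict the same samples")

    voted = []
    # zip(*predictions) groups the votes cast for each individual sample.
    for sample_votes in zip(*predictions):
        label, _count = Counter(sample_votes).most_common(1)[0]
        voted.append(label)
    return voted


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical labels (1 = diabetic, 0 = non-diabetic) for six patients.
lr_pred = [1, 0, 1, 1, 0, 0]   # assumed logistic regression output
dt_pred = [1, 1, 0, 1, 0, 1]   # assumed decision tree output
svm_pred = [0, 0, 1, 1, 0, 1]  # assumed support vector machine output
y_true = [1, 0, 1, 1, 0, 0]    # assumed ground truth

ensemble_pred = hard_vote([lr_pred, dt_pred, svm_pred])
print(ensemble_pred)
print(accuracy(y_true, ensemble_pred))
```

In practice the same behaviour is available in scikit-learn through `VotingClassifier` with `voting="hard"`, which fits the base estimators and applies the majority vote in one object.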




© 2022 Atif et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)

