P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology


Year: 2020, Volume: 13, Issue: 22, Pages: 2189-2202

Original Article

Voting-Boosting: A novel machine learning ensemble for the prediction of Infants' Data

Received Date: 01 May 2020, Accepted Date: 15 June 2020, Published Date: 24 June 2020

Abstract

Background/Objectives: Owing to the continuous growth of electronic records and recent advances in machine learning, various automated disease diagnosis tools have been developed and proposed in the healthcare sector. In the present study, an ensemble methodology using voting and boosting techniques is proposed for optimal feature selection and prediction on infants' data from India. Methods/Analysis: For feature selection, the best-first search algorithm of the wrapper technique has been used in addition to Voting-Boosting. The proposed ensemble combines heterogeneous classifiers, including Random Forest, J48, JRip, CART and Stochastic Gradient Descent (SGD). The effectiveness of the proposed ensemble and of the single classifiers has been investigated in terms of classification accuracy, precision, recall, F-measure, MCC and PRC area using k-fold cross-validation with varied k. Findings: The results show that the proposed Voting-Boosting ensemble (k=15) outperforms the individual classifiers on the selected features. Applications/Improvements: The proposed Voting-Boosting ensemble can be extended with more state-of-the-art classification approaches and applied to other healthcare datasets to enhance performance.

Keywords: Machine learning; ensemble; feature selection; wrapper; voting and boosting
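
The voting step and two of the evaluation ingredients named in the abstract (MCC and k-fold cross-validation) can be illustrated with a small sketch. This is not the authors' implementation: it uses only the Python standard library, it omits boosting, wrapper-based feature selection and the base classifiers themselves, and the function names and toy labels below are hypothetical choices made for illustration. It assumes binary labels and simple unweighted majority (hard) voting, which the abstract does not confirm as the paper's combination rule.

```python
from collections import Counter
from math import sqrt


def majority_vote(predictions):
    """Hard voting: combine per-classifier label lists into one label per sample."""
    combined = []
    for sample_votes in zip(*predictions):
        # most_common(1) picks the label with the highest count; on a tie,
        # the label that appears first among the votes wins.
        combined.append(Counter(sample_votes).most_common(1)[0][0])
    return combined


def mcc(y_true, y_pred, positive=1):
    """Matthews correlation coefficient for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal count is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop


# Toy labels (illustrative only, not from the paper): three hypothetical
# classifiers that each make two mistakes on six samples.
y_true = [1, 0, 1, 1, 0, 0]
preds = [
    [1, 0, 1, 0, 0, 1],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
]
ensemble = majority_vote(preds)
print(ensemble, mcc(y_true, ensemble))
print([round(mcc(y_true, p), 3) for p in preds])
```

In this toy case the individual mistakes fall on different samples, so the majority vote recovers every label. That is the intuition behind combining heterogeneous classifiers, although real gains depend on how correlated the base classifiers' errors are.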


Copyright

© 2020 Mansotra, Kour, Kumar. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Published By Indian Society for Education and Environment (iSee)
