P-ISSN: 0974-6846, E-ISSN: 0974-5645

Indian Journal of Science and Technology



Year: 2024, Volume: 17, Issue: 6, Pages: 533-547

Original Article

Big Data Analytics for Heart Disease Prediction using Regularized Principal and Quadratic Entropy Boosting

Received Date: 18 November 2023, Accepted Date: 02 January 2024, Published Date: 02 February 2024


Objectives: Recent years have produced a wealth of big data from patients' electronic health records. Cardiovascular disease is one of the leading causes of mortality worldwide. It can be diagnosed from a patient's current test results and medical history, so early and quick diagnosis can reduce the mortality rate. To address this need, several machine learning methods have recently been applied to cardiovascular disease diagnosis and prediction. Previous research concentrated on identifying the features that are significant for heart disease prediction, but paid less attention to the time and error rate involved in assessing the strength of these features.

Methods: This work develops a method called Regularized Principal Component and Quadratic Weighted Entropy Boosting (RPC-QWEB) for predicting heart disease. First, RPC-QWEB selects relevant features, avoiding missing values in the input database, by employing Regularized Principal Component Regressive Feature Selection (RPCRFS). Second, using the resulting dimensionality-reduced features, the Quadratic Weighted Entropy Boosting Classification (QWEBC) process classifies patient data as normal or abnormal. QWEBC is an ensemble of several weak classifiers (i.e., quadratic classifiers). The weak classifier results are combined into a strong classifier that gives the final prediction, normal or abnormal, with minimal error rate.

Findings: Experimental evaluation on the cardiovascular disease dataset measures heart disease prediction accuracy, heart disease prediction time, sensitivity, and error rate with respect to distinct numbers of patient records. The proposed RPC-QWEB method was compared with the existing Heart Disease Prediction Framework (HDPF) and Swarm Artificial Neural Network (Swarm-ANN).

Novelty: The RPC-QWEB method outperforms conventional learning methods on several performance metrics. Compared with the existing benchmark methods, it increases accuracy and sensitivity by 3% and 5% and reduces prediction time and error rate by 7% and 29%, respectively. The method may be used to predict heart disease at an early stage and thereby reduce the death rate.
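
The ensemble step described under Methods, in which weak classifier results are combined into a strong classifier, can be illustrated with a minimal sketch. The abstract does not give the details of QWEBC, so this sketch substitutes standard AdaBoost with single-feature threshold stumps for the quadratic weak classifiers and the entropy-based weighting. The function names, the toy data, and the label encoding (+1 for abnormal, -1 for normal) are all assumptions made for illustration.

```python
import math


def stump_predict(x, threshold, polarity):
    """Weak learner: returns `polarity` when x >= threshold, else its negation."""
    return polarity if x >= threshold else -polarity


def fit_adaboost(xs, ys, rounds=3):
    """Standard AdaBoost over threshold stumps (a stand-in for QWEBC, not the paper's method)."""
    n = len(xs)
    weights = [1.0 / n] * n
    # Candidate thresholds are midpoints between consecutive distinct feature values.
    values = sorted(set(xs))
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
    ensemble = []  # list of (alpha, threshold, polarity)
    for _ in range(rounds):
        best = None
        for t in thresholds:
            for p in (1, -1):
                # Weighted error: total weight of the samples this stump misclassifies.
                err = sum(w for x, y, w in zip(xs, ys, weights)
                          if stump_predict(x, t, p) != y)
                if best is None or err < best[0]:
                    best = (err, t, p)
        err, t, p = best
        err = min(max(err, 1e-10), 1 - 1e-10)  # keep the logarithm finite
        alpha = 0.5 * math.log((1 - err) / err)  # vote strength of this weak learner
        ensemble.append((alpha, t, p))
        # Increase the weight of misclassified samples, decrease the rest, renormalize.
        weights = [w * math.exp(-alpha * y * stump_predict(x, t, p))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Strong classifier: sign of the alpha-weighted vote of all weak learners."""
    score = sum(alpha * stump_predict(x, t, p) for alpha, t, p in ensemble)
    return 1 if score >= 0 else -1


# Hypothetical toy data: one feature, labels +1 (abnormal) / -1 (normal).
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1]

ensemble = fit_adaboost(xs, ys, rounds=3)
predictions = [predict(ensemble, x) for x in xs]
correct = sum(1 for yp, y in zip(predictions, ys) if yp == y)
print(f"{correct} of {len(ys)} toy samples classified correctly")
```

On this toy data no single stump classifies more than 6 of the 9 samples correctly, while the three-round weighted vote gets 8 of 9, which is the general idea behind combining weak classifiers.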

Keywords: Big data, Regularized Principal Component, Quadratic Weighted Entropy Boosting, Regressive Feature Selection, Classification
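
For readers unfamiliar with the reported measures, the following minimal sketch computes accuracy, sensitivity, and error rate from confusion-matrix counts. It uses the standard textbook definitions, which are assumed here because the abstract does not state the paper's exact formulas. The function name and the counts are hypothetical and are not results from the paper.

```python
def evaluate(tp, tn, fp, fn):
    """Compute standard metrics from confusion-matrix counts.

    tp: abnormal cases predicted abnormal   fn: abnormal cases predicted normal
    tn: normal cases predicted normal       fp: normal cases predicted abnormal
    """
    total = tp + tn + fp + fn
    if total == 0 or (tp + fn) == 0:
        raise ValueError("need at least one sample and one abnormal case")
    return {
        "accuracy": (tp + tn) / total,    # share of all patients classified correctly
        "sensitivity": tp / (tp + fn),    # share of abnormal patients detected
        "error_rate": (fp + fn) / total,  # share of all patients misclassified
    }


# Hypothetical counts for 100 patient records (illustration only).
result = evaluate(tp=40, tn=45, fp=5, fn=10)
print(result)
```

With these definitions the error rate is always one minus the accuracy, so the two measures carry the same information on a fixed set of patient records.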


  1. Nandy S, Adhikari M, Balasubramanian V, Menon VG, Li X, Zakarya M. An intelligent heart disease prediction system based on swarm-artificial neural network. Neural Computing and Applications. 2023;35(20):14723–14737. Available from: https://doi.org/10.1007/s00521-021-06124-1
  2. Ashri SEA, El-Gayar MM, El-Daydamony EM. HDPF: Heart Disease Prediction Framework Based on Hybrid Classifiers and Genetic Algorithm. IEEE Access. 2021;9:146797–146809. Available from: https://doi.org/10.1109/ACCESS.2021.3122789
  3. Mohan SK, CT, Srivastava G. Effective Heart Disease Prediction Using Hybrid Machine Learning Techniques. IEEE Access. 2019;7:81542–81554. Available from: https://doi.org/10.1109/ACCESS.2019.2923707
  4. An Y, Huang N, Chen X, Wu F, Wang J. High-Risk Prediction of Cardiovascular Diseases via Attention-Based Deep Neural Networks. IEEE/ACM Transactions on Computational Biology and Bioinformatics. 2021;18(3):1093–1105. Available from: https://doi.org/10.1109/TCBB.2019.2935059
  5. Dutta A, Batabyal T, Basu M, Acton ST. An efficient convolutional neural network for coronary heart disease prediction. Expert Systems with Applications. 2020;159:113408. Available from: https://doi.org/10.1016/j.eswa.2020.113408
  6. Wong ND, Zhao Y, Xiang P, Coll B, López JAG. Five-Year Residual Atherosclerotic Cardiovascular Disease Risk Prediction Model for Statin Treated Patients With Known Cardiovascular Disease. The American Journal of Cardiology. 2020;137:7–11. Available from: https://doi.org/10.1016/j.amjcard.2020.09.043
  7. Zhou C, Li A, Hou A, Zhang Z, Zhang Z, Dai P, et al. Modeling methodology for early warning of chronic heart failure based on real medical big data. Expert Systems with Applications. 2020;151:113361. Available from: https://doi.org/10.1016/j.eswa.2020.113361
  8. Shankar V, Kumar V, Devagade U, Karanth V, Rohitaksha K. Heart Disease Prediction Using CNN Algorithm. SN Computer Science. 2020;1(3). Available from: https://doi.org/10.1007/s42979-020-0097-6
  9. Kumar Y, Koul A, Singla R, Ijaz MF. Artificial intelligence in disease diagnosis: a systematic literature review, synthesizing framework and future research agenda. Journal of Ambient Intelligence and Humanized Computing. 2023;14(7):8459–8486. Available from: https://doi.org/10.1007/s12652-021-03612-z
  10. Haithamelwahsh E, El-Shafeiy S, Tawfeek MA. A new smart healthcare framework for real-time heart disease detection based on deep and machine learning. PeerJ Computer Science. 2021;7:1–34. Available from: https://doi.org/10.7717/peerj-cs.646
  11. Patro SP, Nayak GS, Padhy N. Heart disease prediction by using novel optimization algorithm: A supervised learning prospective. Informatics in Medicine Unlocked. 2021;26:1–17. Available from: https://doi.org/10.1016/j.imu.2021.100696
  12. Mehmood A, Iqbal M, Mehmood Z, Irtaza A, Nawaz M, Nazir T, et al. Prediction of Heart Disease Using Deep Convolutional Neural Networks. Arabian Journal for Science and Engineering. 2021;46:3409–3422. Available from: https://doi.org/10.1007/s13369-020-05105-1
  13. Bertsimas D, Mingardi L, Stellato B. Machine Learning for Real-Time Heart Disease Prediction. IEEE Journal of Biomedical and Health Informatics. 2021;25(9):3627–3637. Available from: https://doi.org/10.1109/JBHI.2021.3066347
  14. Yazdani A, Varathan KD, Chiam YK, Malik AW, Ahmad WAW. A novel approach for heart disease prediction using strength scores with significant predictors. BMC Medical Informatics and Decision Making. 2021;21(1):1–16. Available from: https://doi.org/10.1186/s12911-021-01527-5
  15. Garate-Escamila AK, Hassani AHE, Andres E. Classification models for heart disease prediction using feature selection and PCA. Informatics in Medicine Unlocked. 2020;19:1–11. Available from: https://doi.org/10.1016/j.imu.2020.100330
  16. Sushma SJ, Assegie TA, Vinutha DC, Padmashree S. An improved feature selection approach for chronic heart disease detection. Bulletin of Electrical Engineering and Informatics. 2021;10(6):3501–3506. Available from: https://doi.org/10.11591/eei.v10i6.3001
  17. Krishnamoorthi R, Joshi S, Almarzouki HZ, Shukla PK, Rizwan A, Kalpana C, et al. A Novel Diabetes Healthcare Disease Prediction Framework Using Machine Learning Techniques. Journal of Healthcare Engineering. 2022;2022:1–10. Available from: https://doi.org/10.1155/2022/1684017
  18. Chen RC, Dewi C, Huang SW, REC. Selecting critical features for data classification based on machine learning methods. Journal of Big Data. 2020;7:1–26. Available from: https://doi.org/10.1186/s40537-020-00327-4
  19. Yahaya L, Oye ND, Garba EJ. A Comprehensive Review on Heart Disease Prediction Using Data Mining and Machine Learning Techniques. American Journal of Artificial Intelligence. 2020;4(1):20–29. Available from: https://doi.org/10.11648/j.ajai.20200401.12
  20. Zhang D, Chen Y, Chen Y, Ye S, Cai W, Jiang J, et al. Heart Disease Prediction Based on the Embedded Feature Selection Method and Deep Neural Network. Journal of Healthcare Engineering. 2021;2021:1–9. Available from: https://doi.org/10.1155/2021/6260022
  21. Ali F, El-Sappagh S, Islam SMR, Kwak D, Ali A, Imran M, et al. A smart healthcare monitoring system for heart disease prediction based on ensemble deep learning and feature fusion. Information Fusion. 2020;63:208–222. Available from: https://doi.org/10.1016/j.inffus.2020.06.008


© 2024 Muthulakshmi & Parveen. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)

