P-ISSN: 0974-6846, E-ISSN: 0974-5645

Indian Journal of Science and Technology


Year: 2023, Volume: 16, Issue: 45, Pages: 4141-4155

Original Article

Multioutput Ensemble Machine Learning Algorithm: A Prediction Model of Acute Respiratory infection and Pneumonia Occurrence

Received Date: 29 April 2023, Accepted Date: 02 November 2023, Published Date: 05 December 2023


Objectives: To build a robust model that forecasts daily out-patient department (OPD) patient volume from air pollution and weather parameters such as PM2.5 levels, temperature, humidity, wind speed, and rainfall, while handling substantial missing values, thereby improving healthcare planning and delivery.

Methods: To develop the multioutput ensemble model for forecasting daily OPD volume, we used 13 machine learning techniques, including regression analysis, Extra Tree regressor, and Support Vector regressor. We collected and pre-processed data from multiple sources: air quality and weather parameters from NASA’s website, and historical healthcare data from Shatabdi Hospital, Govandi, Mumbai. The model combines a Gaussian regressor with an Extra Tree regressor, and its performance was evaluated using metrics such as FastDTW and RMSE.

Findings: The multioutput ensemble model performed significantly better than the other models, even in the presence of outliers, multicollinearity, and non-stationarity, with a Root Mean Squared Error (RMSE) of 0.46 for acute respiratory infection (ARI) at a lag of 7 days and 0.22 for Pneumonia at a lag of 8 days. The model also worked well when Covid-19 period data were included, a period in which the correlation between independent and dependent variables was negligible.

Novelty: First, none of the datasets previously used for time series prediction had a significant gap in the recorded data in the time domain; this research handles such a gap effectively. Second, all earlier research in this domain addresses only a single disease, which yields the same lag value irrespective of the disease. The period of expression after the event occurrence may vary across multiple diseases, albeit in one domain, that are triggered by similar and/or different air pollutants. This issue has been addressed by ensembling multiple ML algorithms to effectively optimize time complexity.
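The per-disease lag and the combination of a Gaussian regressor with an Extra Tree regressor can be illustrated with a minimal sketch. It rests on several assumptions that the abstract does not confirm: "Gaussian regressor" is read as scikit-learn's Gaussian process regressor, the two models are combined by simple averaging through `VotingRegressor`, and the data are synthetic stand-ins. The helper names `make_lagged` and `fit_per_target` are hypothetical; only the lags of 7 and 8 days come from the abstract. FastDTW is omitted, and only RMSE is computed.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesRegressor, VotingRegressor
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel
from sklearn.metrics import mean_squared_error


def make_lagged(X, y, lag):
    """Pair the exposure on day t with the outcome on day t + lag."""
    if lag <= 0:
        return X, y
    return X[:-lag], y[lag:]


def fit_per_target(X, Y, lags, seed=0):
    """Fit one averaged GP + Extra Trees ensemble per target, each with its own lag.

    Averaging via VotingRegressor is an assumption, not the authors' stated method.
    """
    models = {}
    for j, (name, lag) in enumerate(lags.items()):
        X_lag, y_lag = make_lagged(X, Y[:, j], lag)
        ensemble = VotingRegressor([
            ("gp", GaussianProcessRegressor(kernel=RBF() + WhiteKernel(),
                                            normalize_y=True,
                                            random_state=seed)),
            ("et", ExtraTreesRegressor(n_estimators=200, random_state=seed)),
        ])
        models[name] = ensemble.fit(X_lag, y_lag)
    return models


# Synthetic stand-in data (NOT the study's data): five columns standing in for
# PM2.5, temperature, humidity, wind speed, and rainfall.
rng = np.random.default_rng(0)
n_days = 200
X = rng.normal(size=(n_days, 5))

# Lags reported in the abstract: 7 days for ARI, 8 days for Pneumonia.
lags = {"ARI": 7, "Pneumonia": 8}

# Each synthetic outcome responds to one feature after its own delay.
Y = np.empty((n_days, 2))
Y[:, 0] = np.roll(X[:, 0], lags["ARI"]) + 0.1 * rng.normal(size=n_days)
Y[:, 1] = 0.5 * np.roll(X[:, 1], lags["Pneumonia"]) + 0.1 * rng.normal(size=n_days)

# Chronological split: train on the first 150 days, test on the remainder.
split = 150
models = fit_per_target(X[:split], Y[:split], lags)

scores = {}
for j, (name, lag) in enumerate(lags.items()):
    X_test, y_test = make_lagged(X[split:], Y[split:, j], lag)
    pred = models[name].predict(X_test)
    scores[name] = float(np.sqrt(mean_squared_error(y_test, pred)))

print(scores)  # RMSE per disease on the synthetic hold-out period
```

Fitting a separate ensemble per disease lets each outcome use its own lag, which is the point the Novelty statement makes about diseases with different periods of expression. A shared multioutput estimator with a single lag would not capture that difference.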

Keywords: Acute Respiratory Infection, Pneumonia, Gaussian Regressor, Extra Tree Regressor, Weather Data, Air Pollution




© 2023 Pimpale & Pandit.  This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)

