Global warming and climate change are major current challenges for life on Earth. Global warming is caused primarily by the excessive emission of carbon dioxide into the atmosphere. Extensive research on global warming has also highlighted the role of ozone as a core component of the global climate system. Ozone is a gas that is naturally present in the atmosphere and is found mainly in two regions: about 90% in the stratosphere and the remaining 10% in the troposphere. Total ozone at any location on the globe is defined as the sum of all the ozone in the atmosphere directly above that location and is measured in Dobson Units (DU) ^{1}. The thickness of Total Column Ozone (TCO) is affected by various factors such as weather systems, changes in atmospheric constituents like CH_{4}, N_{2}O and CFCs due to anthropogenic activities, the Quasi-Biennial Oscillation (QBO) and changes in the UV flux of solar insolation. Since TCO is an important component of the Earth’s climate system, predicting its variations is essential for understanding future changes to our environment.
Regression analysis, time series decomposition and moving average methods are some of the conventional forecasting models. Machine learning is an important branch of artificial intelligence and data science that allows machines to learn and to solve complex problems, including predictive analytics. An Artificial Neural Network (ANN) is a mathematical model inspired by the functional features of biological neural networks ^{2}. Deep learning is a broad subset of machine learning methods based on artificial neural networks, developed to improve the accuracy of prediction models, and it is effective even when the data are diverse, less structured or independent. Deep learning methods have been successfully implemented for forecasting stock levels, traffic flow, sales levels, air pollutants, etc. An ANN model was implemented in ^{3} to predict mechanical properties of aluminum alloys such as plastic behavior, yield strength and tensile strength. The phase transitions in frustrated magnetic models were successfully determined by ^{4} using an ANN. An ANN model was utilized by ^{2} for the prediction of animal category. The authors of ^{5} proposed a novel approach based on LSTM-PSO for sales forecasting in e-commerce. The LSTM approach achieved high accuracy in sales forecasting of goods with short-term demand ^{6}. The authors of ^{7} presented an LSTM model for traffic flow prediction. ANN and Multiple Linear Regression models were adopted by ^{8, 9} for forecasting surface ozone density and CO, NO_{2} and SO_{2} concentrations from historical concentration data. The authors of ^{10} discussed the implementation of LSTM and ARIMA models to forecast air pollutants such as CO and NH_{3}. The authors of ^{11} adopted four different models, namely ARIMA, FB Prophet, LSTM and 1D-CNN, to predict the concentration of PM_{2.5}.
In ^{12}, the authors proposed LSTM and DAE (Deep Autoencoder) models for the prediction of PM_{2.5} in South Korea using historical PM_{2.5} data. The studies that used LSTM reported that it provided better results than conventional models. This work focuses on the prediction of TCO concentration over a tropical region using both conventional and machine learning methods.
The TCO data for a period of 20 years (1^{st} January 2000 – 31^{st} December 2019) for a tropical location in India at 13.00°N, 80.18°E on the coast of the Bay of Bengal are used in this study. The data were obtained from NASA’s SBUV Merged Ozone Data base (https://acdext.gsfc.nasa.gov/anonftp/toms/sbuv/MERGED). The meteorological data, namely Relative Humidity (RH), Pressure (P), Dew Point (DEW), Earth Skin Temperature (EST), Atmospheric Temperature (T), Wind Speed at 50 m (WS50M), Wind Speed at 10 m (WS10M) and Solar Radiation Insolation Index (SRI), were sourced from NASA’s POWER Data Access site (https://power.larc.nasa.gov).
The analysis of TCO concentration along with meteorological parameters was carried out using various statistical and mathematical tools. Correlation analysis and time series analysis were used to analyse the data for dependency, trend and the influence of meteorological parameters on ozone. The deep learning frameworks TensorFlow and Keras were used to build the proposed LSTM model. Both the MLR model and the LSTM model are described in Sections 2.1-2.4.
A time series is a sequential set of data points, typically measured over successive times. The four main components of a time series are trend, seasonal, cyclic and irregular. The trend describes the overall long-term movement of the average of the data. The cyclic component describes medium-term changes in the data. The seasonal component occurs when a pattern repeats at a regular interval across the series. The irregular component contributes randomness to the time series data ^{13}.
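The components described above can be illustrated with a minimal additive decomposition sketch. It assumes the series is the sum of trend, seasonal and irregular parts, estimates the trend with a centred moving average and folds any cyclic movement into the trend. The function names and the toy series are hypothetical, and this is not presented as the procedure used in this study.

```python
from statistics import mean


def moving_average(series, window):
    """Centred moving average; positions without a full window are None."""
    half = window // 2
    trend = [None] * len(series)
    for i in range(half, len(series) - half):
        trend[i] = mean(series[i - half:i + half + 1])
    return trend


def decompose(series, period):
    """Additive split: series = trend + seasonal + irregular."""
    # Assumption: an odd window. For an even period this uses period + 1
    # points, a simplification of the usual 2 x period centred average.
    window = period if period % 2 == 1 else period + 1
    trend = moving_average(series, window)
    detrended = [x - t if t is not None else None
                 for x, t in zip(series, trend)]
    # Average the detrended values at each position within the period.
    phase_means = []
    for p in range(period):
        vals = [d for i, d in enumerate(detrended)
                if d is not None and i % period == p]
        phase_means.append(mean(vals))
    centre = mean(phase_means)  # force the seasonal part to average zero
    seasonal = [phase_means[i % period] - centre for i in range(len(series))]
    irregular = [x - t - s if t is not None else None
                 for x, t, s in zip(series, trend, seasonal)]
    return trend, seasonal, irregular


# Toy series: linear trend plus a repeating pattern of period 5.
pattern = [2, 1, 0, -1, -2]
series = [0.1 * i + pattern[i % 5] for i in range(30)]
trend, seasonal, irregular = decompose(series, 5)
```

On this noise-free toy series the seasonal estimate recovers the repeating pattern and the irregular part is numerically zero; real data would leave a non-zero irregular component.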
Regression models are purely statistical models based on empirical relations among variables and are very useful for prediction and data description. Multiple Linear Regression (MLR) assumes a linear relationship between the independent and dependent variables ^{8} and is used to examine the relationships between them. Once each independent variable has been shown to contribute to predicting the dependent variable, the information from the multiple variables can be combined to estimate the size of their effect on the outcome variable. In general, the regression equation may be written as Equation 1.
Y = b_{0} + b_{1}X_{1} + b_{2}X_{2} + … + b_{s}X_{s} + e  (1)
Here, Y is the dependent variable, X_{1}, X_{2}, …, X_{s} are the independent variables, b_{0}, b_{1}, ..., b_{s} are the regression coefficients and e is the error in prediction (residual) for each case ^{14}. The coefficient of determination R^{2} is calculated to indicate the percentage of the total variance that is explained by the independent variables.
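As an illustration of how the regression coefficients and R^{2} can be obtained, the following sketch fits an MLR model by ordinary least squares using the normal equations. It is a minimal pure-Python example on made-up data; the helper names (`solve`, `fit_mlr`, `predict`, `r_squared`) are hypothetical and the study's own software is not specified here.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_mlr(X, y):
    """Return [b0, b1, ..., bs] minimising the sum of squared residuals."""
    rows = [[1.0] + list(r) for r in X]  # leading 1.0 gives the intercept b0
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coeffs, x):
    """Evaluate b0 + b1*x1 + ... + bs*xs for one case."""
    return coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], x))


def r_squared(coeffs, X, y):
    """Share of the total variance of y explained by the fitted model."""
    y_bar = sum(y) / len(y)
    ss_res = sum((yi - predict(coeffs, xi)) ** 2 for xi, yi in zip(X, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Made-up data generated from Y = 1 + 2*X1 + 3*X2 with no error term.
X = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [1 + 2 * a + 3 * b for a, b in X]
coeffs = fit_mlr(X, y)
fit_quality = r_squared(coeffs, X, y)
```

Because the toy data contain no error term, the fit recovers the generating coefficients and R^{2} equals 1 up to rounding; with real observations the residual e is non-zero and R^{2} falls below 1.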
Long short-term memory (LSTM) is a widely used recurrent neural network (RNN) architecture in the field of deep learning ^{15}. It is designed mainly for processing sequence data in which each data segment is correlated with the previous segment. The idea was first proposed by Hochreiter and Schmidhuber, and the network consists of an input layer, recurrent hidden layers known as memory blocks and an output layer ^{16}. The memory unit structure of the LSTM network is shown in
The heart of the LSTM is its memory cell (memory module), which consists of a loop unit and three gates: an input gate, a forget gate and an output gate. A nonlinear function controls each gate to protect and control the state of the memory unit. In general, the output value of each gate lies between 0 and 1; the sigmoid function is used to determine how much information passes through the gate ^{17}. The input gate controls the writing of input information to the memory and decides which new information is to be remembered in the cell state, while the forget gate determines which information is to be discarded from the cell state. The forget gate and output gate determine whether to save or release information from the memory at each decision point ^{18}. The output gate provides the activation for the final output of the LSTM block at time step ‘t’. The state of the LSTM cell is described by a set of gate equations.
In the gate equations, i_{t} represents the input gate, f_{t} represents the forget gate and o_{t} represents the output gate, and ‘*’ indicates element-wise multiplication.
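For reference, a commonly used form of the LSTM update that is consistent with the gate symbols defined above can be written as follows. The weight matrices W, biases b, input x_{t}, hidden state h_{t}, cell state c_{t} and candidate state \tilde{c}_{t} are conventional names assumed here rather than notation taken from this paper.

```latex
\begin{align}
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) \\
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) \\
\tilde{c}_t &= \tanh\left(W_c\,[h_{t-1}, x_t] + b_c\right) \\
c_t &= f_t * c_{t-1} + i_t * \tilde{c}_t \\
h_t &= o_t * \tanh(c_t)
\end{align}
```

Here \sigma is the sigmoid function, so each gate value lies between 0 and 1. The forget gate scales the previous cell state, the input gate scales the candidate state, and the output gate scales what the cell releases as h_{t}.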
The prediction efficiencies of the developed models were tested using the following parameters: Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE). These parameters provide a general indication of the agreement between observed and predicted data and can be calculated using Equations (8-10), respectively ^{21}.
RMSE = \sqrt{ (1/N) \sum_{i=1}^{N} (O_{i} - P_{i})^{2} }  (8)
MAE = (1/N) \sum_{i=1}^{N} |O_{i} - P_{i}|  (9)
MAPE = (100/N) \sum_{i=1}^{N} |(O_{i} - P_{i}) / O_{i}|  (10)
where O_{i} and P_{i} denote the observed and predicted values and N denotes the number of evaluation samples. MAE and RMSE measure the absolute error; the lower their values, the better the performance of the model. MAPE measures the relative error; the smaller its value, the closer the predicted result is to the actual value.
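The three error measures can be computed directly from paired observed and predicted values. The sketch below uses the standard definitions, with MAPE expressed in percent, and a few made-up TCO-like values; the function names and the sample numbers are illustrative assumptions, not data from this study.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean square error."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be non-zero)."""
    n = len(observed)
    return 100.0 / n * sum(abs((o - p) / o) for o, p in zip(observed, predicted))


# Made-up values in Dobson Units, for illustration only.
observed = [260.0, 250.0, 270.0, 280.0]
predicted = [262.0, 247.0, 270.0, 276.0]

rmse_value = rmse(observed, predicted)
mae_value = mae(observed, predicted)
mape_value = mape(observed, predicted)
```

RMSE is never smaller than MAE for the same data because squaring weights large errors more heavily, which is why the two are usually reported together.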
The descriptive statistics of TCO data and meteorological parameters for the period from 2000 to 2019 are given in the following table.





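| Parameter | Minimum | Maximum | Mean | Standard deviation |
|---|---|---|---|---|
| TCO | 217.4 | 288.6 | 260.91 | 13.21 |
| RH | 49.21 | 92.18 | 72.53 | 7.47 |
| P | 99.25 | 101.38 | 100.45 | 0.37 |
| DEW | 15.79 | 26.48 | 22.49 | 1.93 |
| EST | 23.31 | 34.96 | 28.86 | 2.43 |
| T | 22.37 | 34.12 | 28.05 | 2.29 |
| WS50M | 1.24 | 16.51 | 5.39 | 1.60 |
| WS10M | 1.00 | 13.58 | 4.38 | 1.26 |
| SRI | 0.16 | 7.86 | 5.37 | 1.49 |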
Meteorological conditions affect the processes of ozone formation, transport and dispersion. The changes in TCO vary from place to place depending on meteorological parameters and geographical factors. The scatter diagram and Karl Pearson’s coefficient of correlation were used as measures of correlation.







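| Parameter | r | t | df | 95% CI lower | 95% CI upper | p-value |
|---|---|---|---|---|---|---|
| RH | -0.3375 | -30.64 | 7303 | -0.3576689 | -0.3170259 | <2.2e-16 |
| P | -0.7546 | -98.27 | 7303 | -0.7642985 | -0.7445435 | <2.2e-16 |
| DEW | 0.6046 | 64.86 | 7303 | 0.5898003 | 0.6189078 | <2.2e-16 |
| EST | 0.7449 | 95.42 | 7303 | 0.734522 | 0.7549431 | <2.2e-16 |
| T | 0.8078 | 117.13 | 7303 | 0.7997151 | 0.8156544 | <2.2e-16 |
| WS50M | 0.0999 | 8.59 | 7303 | 0.0772381 | 0.1226447 | <2.2e-16 |
| WS10M | 0.0932 | 7.99 | 7303 | 0.0703959 | 0.1158629 | <2.2e-16 |
| SRI | 0.1923 | 16.74 | 7303 | 0.1700815 | 0.2142519 | <2.2e-16 |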
The results of the correlation analysis between the ozone concentration and the meteorological parameters are shown for the level of significance p < 0.001. The correlation between TCO and relative humidity is negative (r = -0.3375), which implies that TCO tends to decrease as relative humidity increases.
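The link between a correlation coefficient and its significance test can be sketched as follows. The code computes Pearson's r from paired data and converts r into the t statistic t = r·sqrt(df)/sqrt(1 - r²) with df = N - 2, then applies it to the relative humidity values reported in this study (r = -0.3375, df = 7303). The function names and the four-point sample are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Karl Pearson's coefficient of correlation for paired samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, df):
    """t statistic for testing r against zero, with df = N - 2."""
    return r * sqrt(df) / sqrt(1.0 - r * r)


# Perfectly linear toy data give r = 1 (or -1 when the trend is reversed).
r_up = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
r_down = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])

# Values reported for TCO versus relative humidity.
t_rh = t_statistic(-0.3375, 7303)
```

With these inputs `t_rh` comes out close to -30.64, matching the magnitude of the t value tabulated for relative humidity. With more than 7000 degrees of freedom even a small |r| gives a large |t|, so statistical significance alone does not indicate a strong relationship.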
The negative correlation between TCO and pressure (r = -0.7546) implies that an increase in pressure is accompanied by a decrease in ozone concentration. Pearson’s correlation analysis shows a positive correlation (r = 0.6046) between TCO and dew point over the study region. The correlation coefficients r = 0.7449 and r = 0.8078 indicate a strong positive relationship between TCO and Earth skin temperature and atmospheric temperature, respectively. It is evident that ozone increases with temperature. The correlations of ozone concentration with wind speed at 50 m and 10 m are weak (r = 0.0999 and r = 0.0932) and are not considered significant in this study. The correlation between TCO and the Solar Radiation Insolation Index is positive (r = 0.1923) and is shown in
Time series analysis comprises methods that attempt to understand the underlying context of the data points and to make forecasts. For the study period, the minimum TCO value was observed in January 2013 and the maximum in June 2004. In the monthly averages, the lowest concentration was obtained in December and the highest in May. The Augmented Dickey-Fuller test was adopted to check the stationarity of the ozone time series. The test statistic of -6.3676 is less than the critical value of -3.431 at the 1% level, which indicates that the data do not have a unit root and are stationary, so the series can be used to build the models.
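The logic of the stationarity check can be sketched with a simplified Dickey-Fuller statistic. The code regresses the first difference of the series on its lagged value plus a constant and returns the t statistic of the lag coefficient, which is then compared with the 1% critical value quoted above. This is an assumption-laden simplification: it omits the lagged difference terms that make the test "augmented", the function name is hypothetical, and the synthetic series is random noise rather than the TCO data.

```python
import random
from math import sqrt


def dickey_fuller_stat(series):
    """t statistic of gamma in: diff(y)_t = alpha + gamma * y_(t-1) + error."""
    lagged = series[:-1]
    diffs = [b - a for a, b in zip(series[:-1], series[1:])]
    n = len(lagged)
    mean_lag = sum(lagged) / n
    mean_diff = sum(diffs) / n
    sxx = sum((v - mean_lag) ** 2 for v in lagged)
    sxz = sum((v - mean_lag) * (d - mean_diff) for v, d in zip(lagged, diffs))
    gamma = sxz / sxx
    alpha = mean_diff - gamma * mean_lag
    sse = sum((d - alpha - gamma * v) ** 2 for v, d in zip(lagged, diffs))
    std_err = sqrt(sse / (n - 2) / sxx)
    return gamma / std_err


CRITICAL_1PCT = -3.431  # 1% critical value quoted in the text

# Synthetic stationary series: independent noise around a fixed level.
random.seed(0)
noise = [random.gauss(260.0, 13.0) for _ in range(500)]
stat = dickey_fuller_stat(noise)
is_stationary = stat < CRITICAL_1PCT
```

A statistic below the critical value rejects the unit-root hypothesis, which is the same decision rule applied to the value of -6.3676 reported for the TCO series.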
TCO is estimated using eight independent variables: Relative Humidity (RH), Pressure (P), Dew Point (DEW), Earth Skin Temperature (EST), Atmospheric Temperature (T), Wind Speed at 50 m (WS50M), Wind Speed at 10 m (WS10M) and Solar Radiation Insolation Index (SRI). The regression results obtained from modeling the data set are shown in the table below.
y(TCO) = b_{0} + b_{1}*(RH) + b_{2}*(P) + b_{3}*(DEW) + b_{4}*(EST) + b_{5}*(T) + b_{6}*(WS50M) + b_{7}*(WS10M) + b_{8}*(SRI)  (11)
In Equation (11), b_{0} is the regression constant and b_{1}, b_{2}, ..., b_{8} are the regression coefficients of the parameters listed in the same table.





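| Parameter | Coefficient | Estimate | Standard error | t value |
|---|---|---|---|---|
| Regression constant | b_{0} | 533.107 | 410.137 | 1.300 |
| RH | b_{1} | 1.733 | 1.057 | 1.639 |
| P | b_{2} | 5.880 | 3.913 | 1.503 |
| DEW | b_{3} | 6.351 | 1.509 | 4.209 |
| EST | b_{4} | 9.014 | 4.536 | 1.987 |
| T | b_{5} | 3.157 | 4.803 | 0.657 |
| WS50M | b_{6} | 22.170 | 5.293 | 4.188 |
| WS10M | b_{7} | 27.772 | 6.598 | 4.209 |
| SRI | b_{8} | 1.156 | 1.198 | 0.964 |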
The historical TCO data are partitioned into three sets (training, testing and validation), which are used to build and train the model. By means of a random process, the data points are divided into two distinct sets: a training set consisting of 75% of the data and a testing set consisting of the remaining 25%. Data for a period of 100 days (100 data points) are used as the validation set for verifying the predicted TCO. A time series consisting of the past 7305 days of observations is used to predict the values of TCO. A multilayered LSTM recurrent neural network is modelled to predict the values of TCO for the next 100 days. The model is defined with 200 neurons in the 1^{st} hidden layer and 1 neuron in the output layer. To avoid overfitting on the sample data and to increase the generalization capacity of the model, the model is trained for the optimal number of epochs ^{22}. The early stopping callback function in Keras is utilized to monitor the loss and accuracy values, and the optimal number of epochs for training on the sample data is found to be 15. The ADAM optimizer is utilized in this model. The model is fit for 15 training epochs with a batch size of 32, and each time step is one day. LSTM predicted (PTCO) and actual (ATCO) values of ozone are shown in
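The configuration described above can be sketched in Keras as follows. Only the layer sizes (200 LSTM units, 1 output neuron), the Adam optimizer, the 15 epochs, the batch size of 32 and the use of early stopping come from the text. The input window length, the mean squared error loss, the monitored quantity, the patience value and all function names are assumptions made for illustration, so this is a sketch rather than the authors' implementation.

```python
def make_windows(series, window):
    """Frame a 1-D series as (inputs, targets) for one-step-ahead prediction."""
    count = len(series) - window
    inputs = [series[i:i + window] for i in range(count)]
    targets = [series[i + window] for i in range(count)]
    return inputs, targets


def build_lstm(window):
    """LSTM with 200 units and a single output neuron (requires TensorFlow)."""
    # Imported lazily so that make_windows can be used without TensorFlow.
    from tensorflow import keras

    model = keras.Sequential([
        keras.layers.Input(shape=(window, 1)),  # one feature (TCO) per daily step
        keras.layers.LSTM(200),
        keras.layers.Dense(1),
    ])
    # Assumption: mean squared error loss; the text names only the optimizer.
    model.compile(optimizer="adam", loss="mse")
    return model


def train(model, x_train, y_train, x_val, y_val):
    """Fit with early stopping; x arrays must have shape (samples, window, 1)."""
    from tensorflow import keras

    # Assumption: stop on validation loss with a patience of 3 epochs.
    stopper = keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=3, restore_best_weights=True)
    return model.fit(x_train, y_train,
                     validation_data=(x_val, y_val),
                     epochs=15, batch_size=32, callbacks=[stopper])


# Windowing demo on a toy series; real input would be the daily TCO values.
toy_series = list(range(10))
inputs, targets = make_windows(toy_series, 3)
```

Each window of consecutive daily values becomes one training sample whose target is the next day's value, so a 100-day forecast can be produced by feeding predictions back in as inputs one step at a time.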
The values predicted by the MLR model and the LSTM model were compared with the actual values. The prediction efficiencies of the two models were tested using statistical error analysis methods, and the results are tabulated below.





| Model | RMSE | MAE | MAPE |  |
|---|---|---|---|---|
| MLR | 3.807924 | 2.93766 | 1.152153 | 0.809301 |
| LSTM | 3.074492 | 2.574 | 1.01 | 0.921995 |
The statistical comparison of the two models provides a general indication of the relationship between predicted and actual data.
This study predicts the level of TCO concentration over a tropical region using the conventional MLR model and the LSTM model. The prediction accuracy of the models was assessed by RMSE, MAE and MAPE values. The lower values of the error parameters obtained for LSTM indicate its improved performance over the conventional MLR. Moreover, the MLR method uses several meteorological parameters as inputs for estimating TCO, whereas LSTM uses only the TCO time series for prediction. Thus, the LSTM model does not rely on meteorological parameters, which suggests that the TCO concentration over a region may be predicted without the meteorological parameters.
The authors are thankful to the National Aeronautics and Space Administration (NASA) for providing the TCO data and to the Physics Research Centre, S.T. Hindu College, Nagercoil.