Indian Journal of Science and Technology
DOI: 10.17485/ijst/2019/v12i8/141809
Year: 2019, Volume: 12, Issue: 8, Pages: 1-9
Original Article
M. Sujaritha*, M. Kavitha and J. Janet
Department of Centre for Science and Environment, Sri Krishna College of Engineering and Technology, Kuniamuthur, Coimbatore - 641008, Tamil Nadu, India
*Author for correspondence
M. Sujaritha
Department of Centre for Science and Environment, Sri Krishna College of Engineering and Technology, Kuniamuthur, Coimbatore - 641008, Tamil Nadu, India.
Objective: To compare the goodness of fit of Principal Component Regression (PCR) and Partial Least Squares (PLS) regression models using the metrics RMSEP, MSEP and R².

Methods and Statistical Analysis: The study uses regression analysis, which investigates the relationship between independent and dependent variables. The analysis is simpler when researchers understand and choose the most suitable method for the type of dependent variables, the type of independent variables and the dimensionality of the data. Cross-validation is used with both predictive models (PCR and PLS).

Dataset and Findings: This study presents a comparative analysis of PCR and PLS by applying both methods to a public dataset, the octane dataset, which provides spectral data of gasoline samples with 401 attributes. The study concludes that the PLS regression model yields better prediction results than the PCR model because PLS selects the principal components more accurately. PLS also identifies comparatively fewer principal components. The paper additionally analyses the effect of preprocessing, using the same regression methods and dataset.

Improvements: This analysis focuses on the importance of removing the Region of No Interest. When the Region of No Interest is removed, the number of principal components is reduced, which in turn increases the prediction accuracy. When the Region of No Interest is not removed, the number of principal components is high, which decreases the prediction accuracy.
Keywords: Partial Least Squares, Preprocessing, Principal Component Regression
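
The comparison described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: it uses synthetic data with 401 attributes in place of the octane dataset, assumes NumPy is available, and the names pcr_fit_predict, pls_fit_predict and cv_rmsep are hypothetical helpers introduced here. PLS is implemented with the standard single-response NIPALS recursion, and both models are scored by k-fold cross-validated RMSEP and R² for one to five components.

```python
import numpy as np

def pcr_fit_predict(X_train, y_train, X_test, k):
    """PCR: regress y on the first k principal component scores of X."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    Xc = X_train - x_mean
    # Rows of Vt are the principal directions of the centered predictors.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:k].T
    gamma, *_ = np.linalg.lstsq(Xc @ V, y_train - y_mean, rcond=None)
    return (X_test - x_mean) @ V @ gamma + y_mean

def pls_fit_predict(X_train, y_train, X_test, k):
    """PLS1 via NIPALS: components are chosen using covariance with y."""
    x_mean, y_mean = X_train.mean(axis=0), y_train.mean()
    E, f = X_train - x_mean, y_train - y_mean
    W, P, q = [], [], []
    for _ in range(k):
        w = E.T @ f
        w /= np.linalg.norm(w)          # weight vector
        t = E @ w                       # scores
        tt = t @ t
        p = E.T @ t / tt                # X loadings
        c = f @ t / tt                  # y loading
        E = E - np.outer(t, p)          # deflate X
        f = f - c * t                   # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)   # regression coefficients
    return (X_test - x_mean) @ beta + y_mean

def cv_rmsep(fit_predict, X, y, k, n_folds=5):
    """Return cross-validated (RMSEP, R^2) for a model with k components."""
    idx = np.arange(len(y))
    preds = np.empty(len(y))
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        preds[test] = fit_predict(X[train], y[train], X[test], k)
    sse = np.sum((y - preds) ** 2)
    r2 = 1.0 - sse / np.sum((y - y.mean()) ** 2)
    return np.sqrt(sse / len(y)), r2

# Synthetic stand-in for spectral data (assumption): 60 samples, 401 attributes,
# driven by three latent factors plus a small amount of noise.
rng = np.random.default_rng(0)
Z = rng.normal(size=(60, 3))
X = Z @ rng.normal(size=(3, 401)) + 0.05 * rng.normal(size=(60, 401))
y = Z @ np.array([1.0, -0.5, 0.25]) + 0.05 * rng.normal(size=60)

results = {}
for name, model in (("PCR", pcr_fit_predict), ("PLS", pls_fit_predict)):
    for k in range(1, 6):
        results[(name, k)] = cv_rmsep(model, X, y, k)
        rmsep, r2 = results[(name, k)]
        print(f"{name} k={k}: RMSEP={rmsep:.3f} R2={r2:.3f}")
```

The number of components for each model would then be chosen where the cross-validated RMSEP stops improving. MSEP is simply the square of RMSEP. Results on this synthetic data say nothing about the octane dataset itself.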