Indian Journal of Science and Technology
Year: 2023, Volume: 16, Issue: 38, Pages: 3250-3257
Original Article
Himanshi Bhoria1*, Amita Dhankhar2, Kamna Solanki2
1M.Tech. Student, Department of CSE, UIET, MDU Rohtak, India
2Associate Professor, Department of Computer Science & Engineering, University Institute of Engineering and Technology, MDU Rohtak, India
*Corresponding Author
Email: [email protected]
Received Date: 19 April 2023, Accepted Date: 29 August 2023, Published Date: 13 October 2023
Objectives: This study has four goals: 1) to assess students’ performance using several machine learning models; 2) to identify the attributes that influence student performance using feature selection; 3) to compare the machine learning models using accuracy, precision, recall, F-1 score, and AUC (Area Under the Curve) score as performance indicators; and 4) to compare the effectiveness of models built with feature selection against models built without it.
Methods: The student performance dataset from UCI is used for this study. It consists of 650 records with 32 features. The pertinent features are selected with the Chi-square method so that the models can be constructed effectively. Classification models are then trained, and their performance is compared in terms of accuracy, precision, recall, F-1 score, and AUC score.
Findings: For the first objective, student performance is predicted as one of two outcomes, pass or fail. The Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and XGBoost models are evaluated in terms of accuracy, precision, recall, F-1 score, and AUC score. The F-1 scores achieved by DT, RF, SVM, KNN, and XGBoost are 92.16, 95.06, 95.19, 93.8, and 94.59, respectively. For the second objective, the attributes found to influence students’ performance are Failures, Schoolsup, First Period Grade (G1), Second Period Grade (G2), and Final Grade (G3). For the third objective, the SVM classification model outperforms the other models with an F-1 score of 95.19%. For the fourth objective, models that use feature selection perform better than models that do not.
Novelty: Using machine learning to predict students’ performance can revolutionize the education sector by providing a data-driven approach to evaluating academic performance. This work proposes a “Chi-Square Based Feature Selection” (CBFS) technique for predicting students’ performance. Chi-square feature selection keeps only the most relevant features, which helps reduce the model’s complexity and improve its performance.
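The abstract does not give the implementation details of the chi-square selection step. As a rough illustration only, the sketch below scores each categorical feature with the Pearson chi-square statistic of independence against a pass/fail label and keeps the top-scoring features. The function names `chi_square_score` and `select_top_k`, the value of `k`, and the toy records are assumptions made for this example; they are not taken from the study or from the UCI dataset.

```python
from collections import Counter


def chi_square_score(feature, labels):
    """Pearson chi-square statistic between a categorical feature and class labels."""
    n = len(feature)
    observed = Counter(zip(feature, labels))   # contingency-table cell counts
    feature_totals = Counter(feature)          # row totals
    label_totals = Counter(labels)             # column totals
    score = 0.0
    for f_val, f_count in feature_totals.items():
        for l_val, l_count in label_totals.items():
            expected = f_count * l_count / n   # count expected under independence
            diff = observed.get((f_val, l_val), 0) - expected
            score += diff * diff / expected
    return score


def select_top_k(columns, labels, k):
    """Rank features by chi-square score (highest first) and keep the top k."""
    scores = {name: chi_square_score(values, labels)
              for name, values in columns.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Hypothetical toy records, invented for illustration (not the UCI data).
toy_columns = {
    "failures": [0, 0, 0, 0, 1, 1, 2, 3],
    "schoolsup": ["no", "no", "no", "yes", "no", "yes", "yes", "yes"],
    "romantic": ["no", "yes", "no", "yes", "no", "yes", "no", "yes"],
}
toy_labels = ["pass", "pass", "pass", "pass", "fail", "fail", "fail", "fail"]

selected, scores = select_top_k(toy_columns, toy_labels, k=2)
print(selected)
print(scores)
```

A feature that is independent of the label scores near zero and is dropped, while a feature whose values shift with the label scores high and is kept. The selected columns would then be passed to the classifiers (DT, RF, SVM, KNN, XGBoost) for training and comparison.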
Keywords: Machine Learning, Prediction, Dataset Problem, Early Warning System, Educational Data Mining
© 2023 Bhoria et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)