Year: 2023, Volume: 16, Issue: 33, Pages: 2663-2669

Performance Analysis of Feature Selection and Classification Methods for Predicting Dyslexia

Received Date: 23 January 2023, Accepted Date: 28 July 2023, Published Date: 08 September 2023


Objectives: This study aims to select efficient and relevant features for detecting dyslexia with improved accuracy using various Machine Learning (ML) models. Methods: A benchmark dataset from an online gamified dyslexia test, obtained from Kaggle and containing 196 features, was used. The dataset was split into training and testing sets in an 80:20 ratio. Information Gain (IG), Principal Component Analysis (PCA), and Correlation Attribute Evaluation (CAE) were used to select relevant features. The selected features were evaluated using ML classifier models, namely C4.5, Random Forest (RF), Decision Table (DT), Logistic Regression (LR), and Support Vector Machine (SVM). Findings: Of the 196 features, IG selects 192, PCA selects 195, and CAE selects 186. The selected features were tested with the ML classifier models listed above. The results show that CAE combined with the LR classifier is well suited to selecting relevant features, achieving an accuracy of 89.8%. Novelty: This study presents a CAE feature selection approach with an LR classifier that achieves approximately 1.5% higher accuracy than the existing approaches of MIG, K-Best Features, and Recursive Feature Elimination in Random Forest.
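
The correlation-based selection step and the 80:20 split described above can be illustrated with a minimal sketch. It assumes that CAE scores each feature by the absolute Pearson correlation between that feature and the class label and keeps the top-ranked ones; the abstract does not spell out the scoring rule or the tooling, so this is an assumption. The function names, the toy data, and the cut-off k = 2 are hypothetical and are not taken from the study, which reports CAE retaining 186 of 196 features.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_features_by_correlation(rows, labels):
    """Return feature indices ordered by |correlation with the label|, highest first."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, labels)), j))
    # Ties are broken by feature index so the ranking is deterministic.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores]


def split_train_test(rows, labels, test_fraction=0.2, seed=0):
    """Shuffle row indices with a fixed seed and split them into train and test parts."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    n_test = int(round(len(rows) * test_fraction))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = ([rows[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([rows[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test


# Hypothetical toy data: 10 samples, 3 features, binary label.
# Feature 0 equals the label, feature 1 is constant, feature 2 agrees with the label in 8 of 10 rows.
labels = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
feature_2 = [1, 0, 1, 0, 1, 1, 0, 0, 1, 0]
rows = [[labels[i], 5, feature_2[i]] for i in range(10)]

ranking = rank_features_by_correlation(rows, labels)
k = 2  # hypothetical cut-off for this toy example
selected = ranking[:k]
reduced_rows = [[row[j] for j in selected] for row in rows]

(train_rows, train_labels), (test_rows, test_labels) = split_train_test(reduced_rows, labels)
print("ranking:", ranking)
print("train size:", len(train_rows), "test size:", len(test_rows))
```

In this toy case the constant feature receives a score of zero and is ranked last, which is the behaviour a correlation-based filter is meant to provide. The reduced training set would then be passed to a classifier such as LR, which is omitted here.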

Keywords: Machine Learning; Feature selection; Classification; Dyslexia


