Indian Journal of Science and Technology
DOI: 10.17485/IJST/v16i10.2102
Year: 2023, Volume: 16, Issue: 10, Pages: 744-755
Original Article
P Usha¹, M P Anuradha²*
¹Assistant Professor, Department of Information Technology, Bishop Heber College, Affiliated to Bharathidasan University, Tiruchirappalli, 620 024, Tamil Nadu, India
²Assistant Professor, Department of Computer Science, Bishop Heber College, Affiliated to Bharathidasan University, Tiruchirappalli, 620 024, Tamil Nadu, India
*Corresponding Author
Email: [email protected]
Received Date: 30 October 2022, Accepted Date: 08 February 2023, Published Date: 11 March 2023
Objectives: This review examines feature selection processes, strategies, and methods, namely filter, wrapper, and embedded algorithms, and presents their advantages and disadvantages.
Methods: Mutual Information Gain (MIG), Chi-Square (CS), and Recursive Feature Elimination (RFE) are used to select features. Two benchmark datasets, Breast Cancer and Diabetes, are used.
Findings: Choosing an appropriate feature selection method and algorithm is essential for improving efficiency. To measure the performance of the selected features, a Random Forest model is used as the classifier and compared with Support Vector Machine and Decision Tree models. The filter methods select up to 15 of 17 features for the diabetes dataset, with accuracy of 89% to 98%. For the breast cancer dataset, up to 28 of 31 features are selected, with 98.5% accuracy. The wrapper method RFE selects 14 of 17 features for diabetes and 10 of 31 for breast cancer, reaching up to 98.25% accuracy for diabetes and 99.20% for breast cancer.
Novelty: Feature selection techniques help improve performance and efficiency, reduce storage and processing time, and build a better model for subsequent prediction. Proper feature selection helps diagnose diseases at an earlier stage and improve patient survival.
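The comparison described in the abstract can be sketched in a few lines. The following is a minimal illustration, not the authors' code. It assumes scikit-learn, uses the copy of the breast cancer dataset bundled with that library (30 features, which may differ from the 31-column version used in the article), and keeps an arbitrary K = 10 features for every method. The split ratio, random seeds, and tree counts are likewise illustrative assumptions, so the accuracies it prints should not be expected to match the reported figures.

```python
from functools import partial

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectKBest, chi2, mutual_info_classif
from sklearn.model_selection import train_test_split

K = 10  # number of features to keep; an illustrative choice, not from the article

# scikit-learn's bundled breast cancer data (assumption: stands in for the article's dataset).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

selectors = {
    # Filter methods: score each feature independently of any classifier.
    "mutual information (filter)": SelectKBest(
        partial(mutual_info_classif, random_state=0), k=K
    ),
    # chi2 requires non-negative feature values, which holds for this dataset.
    "chi-square (filter)": SelectKBest(chi2, k=K),
    # Wrapper method: repeatedly refit a model and drop the weakest feature.
    "RFE (wrapper)": RFE(
        RandomForestClassifier(n_estimators=50, random_state=0),
        n_features_to_select=K,
    ),
}

results = {}
for name, selector in selectors.items():
    # Fit the selector on training data only, to avoid leaking test information.
    selector.fit(X_train, y_train)
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(selector.transform(X_train), y_train)
    accuracy = clf.score(selector.transform(X_test), y_test)
    results[name] = (int(selector.get_support().sum()), accuracy)
    print(f"{name}: kept {results[name][0]} features, test accuracy {accuracy:.3f}")
```

Swapping `RandomForestClassifier` for `sklearn.svm.SVC` or `sklearn.tree.DecisionTreeClassifier` in the evaluation step would give the kind of classifier comparison the abstract mentions.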
Keywords: Mutual Information Gain; Chi-Square; Recursive Feature Elimination; Support Vector Machine; Random Forest; Decision Tree
© 2023 Usha & Anuradha. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)