• P-ISSN 0974-6846 E-ISSN 0974-5645

Indian Journal of Science and Technology



Year: 2016, Volume: 9, Issue: 39, Pages: 1-6

Original Article

Empirical Study of Feature Selection Methods for High Dimensional Data


Background/Objectives: Feature Selection is the process of selecting relevant features for use in model construction by removing redundant, irrelevant and noisy data. A typical application of Text Mining is the classification of messages and e-mails into spam and ham. Methods/Statistical Analysis: This article gives a comprehensive overview of Feature Selection methods for Text Mining. Filter methods such as Pearson Correlation, Chi-square, Symmetrical Uncertainty and Mutual Information are applied to select the optimal set of features. Findings: Filter Feature Selection methods are used to select features for classifying text data. Various classification algorithms are applied to the optimal set of features obtained, and their accuracy is verified on the chosen data set. Novelty/Improvements: A comparative study of filter methods for Feature Selection and of classification algorithms for performance evaluation is carried out in this research work.
Keywords: Chi-Square, Feature Selection, Filter Method, Mutual Information, Pearson Correlation 
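
As an illustration of how a filter method ranks features independently of any classifier, the following is a minimal sketch of Chi-square scoring for spam/ham text. It is not the implementation used in the article: the toy corpus, the function names `chi_square` and `select_top_k`, and the choice of binary term presence as the feature representation are all assumptions made for this example.

```python
# Toy corpus (hypothetical data, for illustration only).
DOCS = [
    ("win free prize now", "spam"),
    ("free cash win now", "spam"),
    ("claim free prize", "spam"),
    ("meeting at noon", "ham"),
    ("lunch at noon tomorrow", "ham"),
    ("see you at the meeting", "ham"),
]


def chi_square(term, docs, positive="spam"):
    """Chi-square statistic of the 2x2 table: term presence vs. class label."""
    a = b = c = d = 0
    for text, label in docs:
        present = term in text.split()
        if present and label == positive:
            a += 1  # term present, positive class
        elif present:
            b += 1  # term present, other class
        elif label == positive:
            c += 1  # term absent, positive class
        else:
            d += 1  # term absent, other class
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    # A zero denominator means the term or the class is constant: no signal.
    return 0.0 if denom == 0 else n * (a * d - b * c) ** 2 / denom


def select_top_k(docs, k):
    """Return the k terms with the highest score (ties broken alphabetically)."""
    vocab = sorted({word for text, _ in docs for word in text.split()})
    ranked = sorted(vocab, key=lambda t: (-chi_square(t, docs), t))
    return ranked[:k]


if __name__ == "__main__":
    for term in select_top_k(DOCS, 4):
        print(term, round(chi_square(term, DOCS), 3))
```

The selected terms would then be the only features passed to a classifier. Swapping `chi_square` for another scoring function (for example Mutual Information or Pearson Correlation) changes the ranking criterion without changing the rest of the pipeline, which is what makes filter methods easy to compare.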

