Indian Journal of Science and Technology
Year: 2016, Volume: 9, Issue: 46, Pages: 1-6
Dipak Ramchandra Kawade1* and Kavita S. Oza2
1Department of Computer Science, Sangola College, Sangola – 413307, Maharashtra, India; [email protected]
2Department of Computer Science, Shivaji University, Kolhapur – 416004, Maharashtra, India; [email protected]
*Author for correspondence
Dipak Ramchandra Kawade
Department of Computer Science, Sangola College, Sangola – 413307, Maharashtra, India; [email protected]
Objectives: Text classification is an important application of data mining. It assigns text documents to predefined class labels on the basis of words, phrases, word combinations, and similar features. Method/Analysis: The present study classifies news data into four predefined classes: Business, Entertainment, Sports, and Technology. WEKA, an open-source data mining tool, is used for the classification. Several classification algorithms are applied to the news data set and compared on accuracy, time, error, and ROC to identify the best algorithm for classifying news data. Findings: The results are analyzed on the basis of accuracy, time, error, and the ROC curve. The study concludes that the NaïveBayes Multinomial algorithm performs best for news classification.
Keywords: Classification Algorithms, Data Mining, Text Classification, WEKA
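For readers unfamiliar with the technique named in the findings, the following is a minimal sketch of multinomial Naive Bayes text classification with an accuracy check. It is not the WEKA workflow used in the study: the code is plain Python, the eight training sentences are invented toy data, and every name in it (`tokenize`, `train_multinomial_nb`, `predict`, `accuracy`, `TRAIN`) is hypothetical. Laplace smoothing with `alpha=1.0` and whitespace tokenization are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_multinomial_nb(examples, alpha=1.0):
    """Estimate log priors and smoothed per-class word log likelihoods.

    `examples` is a list of (text, label) pairs; `alpha` is the Laplace
    smoothing constant (an assumption, not a value taken from the paper).
    """
    doc_counts = Counter(label for _, label in examples)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        vocabulary.update(tokens)

    log_prior = {}
    log_likelihood = {}
    for label, n_docs in doc_counts.items():
        log_prior[label] = math.log(n_docs / len(examples))
        total = sum(word_counts[label].values())
        denominator = total + alpha * len(vocabulary)
        log_likelihood[label] = {
            word: math.log((word_counts[label][word] + alpha) / denominator)
            for word in vocabulary
        }
    return {
        "vocabulary": vocabulary,
        "log_prior": log_prior,
        "log_likelihood": log_likelihood,
    }


def predict(model, text):
    """Return the class with the highest posterior log score."""
    # Words never seen in training carry no evidence and are skipped.
    tokens = [t for t in tokenize(text) if t in model["vocabulary"]]
    scores = {}
    for label, prior in model["log_prior"].items():
        likelihood = model["log_likelihood"][label]
        scores[label] = prior + sum(likelihood[t] for t in tokens)
    return max(scores, key=scores.get)


def accuracy(model, examples):
    """Fraction of (text, label) pairs that the model labels correctly."""
    correct = sum(predict(model, text) == label for text, label in examples)
    return correct / len(examples)


# Invented toy data, two sentences for each of the four classes in the study.
TRAIN = [
    ("stock market shares rise as bank profit grows", "business"),
    ("company reports quarterly profit and market growth", "business"),
    ("film star wins award at music festival", "entertainment"),
    ("new film and music album released by actor", "entertainment"),
    ("team wins cricket match in final over", "sports"),
    ("football team scores goal to win the match", "sports"),
    ("new smartphone software update released", "technology"),
    ("company launches faster computer chip and software", "technology"),
]

if __name__ == "__main__":
    model = train_multinomial_nb(TRAIN)
    print(predict(model, "cricket team wins final match"))
```

A real comparison such as the one described in the abstract would use a held-out test set or cross-validation on the actual news corpus, and would record training time, error measures, and ROC values alongside accuracy for each algorithm.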