Indian Journal of Science and Technology
Year: 2015, Volume: 8, Issue: 24, Pages: 1-7
S. N. Vinithra*, S. J. Arun Selvan, M. Anand Kumar and K. P. Soman
Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore - 641 112, Tamil Nadu, India;
[email protected], [email protected]
We present a methodology for automatically classifying the sentiment of Twitter messages. Microblogging websites are a challenging new source of data for data mining techniques. The aim of this paper is to determine the exact sentiment of data from the microblogging site Twitter. Tweets often contain URLs to other websites. Tweets also contain a certain number of OOV (Out-Of-Vocabulary) words, for example hashtags, a tagging system for topics that allows Tweets in a similar vein of conversation to be found. Other OOV words include mentions, a mechanism for directing a Tweet to one or more users. The KH Coder tool gives a standard accuracy result, where the text is POS tagged and MySQL is used to store the details in a database. The R tool is used to view the statistical analysis of the data. Further, machine learning algorithms have also been applied. A preprocessing and feature selection method in combination with Maximum Entropy, Naive Bayes and Decision Tree classifiers is presented, and reasonable results have been produced. The accuracy of the machine learning methods for sentiment classification has been compared, and a statistical representation of the classes is depicted through KH Coder.
Keywords: Data Mining, Decision Tree, Maximum Entropy, Microblogging, Naive Bayes, OOV, Sentiment Classification
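As a rough illustration of the pipeline the abstract outlines (normalising URLs, hashtags and mentions, then training a classifier), the following minimal sketch uses plain Python. It is not the authors' implementation: the placeholder tokens `URL` and `MENTION`, the function and class names, the add-one smoothing, and the four toy training tweets are all assumptions made for illustration, and only Naive Bayes is shown out of the three classifiers named.

```python
import math
import re
from collections import Counter, defaultdict

# Assumed normalisation scheme; the abstract does not specify one.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")


def preprocess(tweet):
    """Lowercase a tweet and map URLs, mentions and hashtags to plain tokens."""
    text = tweet.lower()
    text = URL_RE.sub(" URL ", text)          # replace links with a placeholder
    text = MENTION_RE.sub(" MENTION ", text)  # replace @user with a placeholder
    text = HASHTAG_RE.sub(r" \1 ", text)      # keep the hashtag word, drop '#'
    return re.findall(r"[A-Za-z']+", text)


class NaiveBayes:
    """Multinomial Naive Bayes over bag-of-words counts with additive smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing constant (assumed)
        self.class_counts = Counter()           # number of tweets per class
        self.word_counts = defaultdict(Counter) # token counts per class
        self.vocab = set()

    def fit(self, tweets, labels):
        for tweet, label in zip(tweets, labels):
            tokens = preprocess(tweet)
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, tweet):
        tokens = preprocess(tweet)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, float("-inf")
        for label, n_docs in self.class_counts.items():
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            denom = (sum(self.word_counts[label].values())
                     + self.alpha * len(self.vocab))
            for tok in tokens:
                count = self.word_counts[label][tok]
                score += math.log((count + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy data invented for this sketch; not taken from the paper.
train_tweets = [
    "I love this phone #awesome",
    "What a great day @alice",
    "I hate waiting http://t.co/x",
    "This is a terrible service #fail",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = NaiveBayes().fit(train_tweets, train_labels)
print(model.predict("love this great phone"))
print(model.predict("terrible service, I hate it"))
```

Keeping the hashtag word while dropping the `#` lets topical hashtags contribute as ordinary features, whereas URLs and mentions are collapsed to single placeholders because their exact values rarely carry sentiment. Both choices are design assumptions rather than details reported in the abstract.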