Indian Journal of Science and Technology
Year: 2017, Volume: 10, Issue: 19, Pages: 1-6
Vibhor Singh, Priyansh Saxena, Siddharth Singh and S. Rajendran
Department of Information Technology, SRM University, Kattankulathur, Chennai – 603203, Tamil Nadu, India; [email protected], [email protected], [email protected], [email protected]
Background/Objective: Customer reviews are important for various fields (e.g. Movies,Products, Services). Moviereviews plays vital role in describing its success and failure. People have now become very specific on what movies to watch and what not to watch. Hence people don’t want to waste time on a movie that has bad reviews. Nowadaysonline reviews are important for personal recommendation. Methods/Statistical Analysis: There are various works done on opinion mining and text mining. This paper focuses on analyzing sentiment from semantic orientation of words that occur in a text by manually defining dictionary for positive, negative and intensifier words stored in different text file. In this method we extract data from three different sources. Opinions that has to be analyzed is preprocessed and stored in text file. The data are then compared with our bag of words in order to find the number of positive and negative sentiments in the reviews of that particular movie. To predict movie rating three machine learning algorithms have been used to create three different classifier using a trained data set. Findings: After rating prediction, the accuracy of each classifier is calculated on the data set containing predicted rating of various movies given by each classifier. Out of all three machine learning algorithms used, Naive Bayes is found to be more accurate than Decision Tree and K Nearest Neighbour algorithm. Improvements: There can be more many ways where we understand how the users write reviews as the reviews only support English language. Hence we can also bring in multiple languages, so that we do not even miss out on a single review. Most of the time,a single user may not provide review for all the movies. So, our database will resemble a sparse matrix. A prediction algorithm may be designed to mathematically guess the rating for movies that are not originally reviewed by the user.
Keywords: Machine Learning, Opinion Mining, Sentiment Analysis, Text Mining
Subscribe now for latest articles and news.