Classification using Latent Dirichlet Allocation with Naive Bayes Classifier to detect Cyber Bullying in Twitter

K. Nalini and L. Jaba Sheela

doi:10.17485/ijst/2016/v9i28/93825

Indian Journal of Science and Technology

Year: 2016, Volume: 9, Issue: 28, Pages: 1-5

Original Article

K. Nalini¹ and L. Jaba Sheela²

¹Bharathiyar University, [email protected]
²Panimalar Engineering College, [email protected]

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Objectives: Social networks are becoming a risk for minors, especially those who use them regularly, and this use can also lead to cyber bullying. The unstructured text found in this enormous volume of information cannot be processed directly by computers, so specific preprocessing methods and algorithms are needed to extract useful patterns. Methods/Analysis: Text classification is one of the important research issues in the field of text mining. A Twitter corpus is used as the training and test data to build a sentiment classifier. The positive or negative sentiment of a new tweet is then used to detect cyber bullying messages on Twitter, using Latent Dirichlet Allocation (LDA) with a Naive Bayes classifier. Findings: The results show that our model achieves precision, recall and F-measure of nearly 70%. Naive Bayes is the most appropriate algorithm when compared with other algorithms such as J48 and KNN. The CPU processing time of the Naive Bayes algorithm is also lower than that of the other two classification algorithms. Improvements: The performance of the system can be improved by adding extra features and by using a larger amount of data.
Keywords: Cyber Bullying, LDA, Naive Bayes, Text Mining, Twitter
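
To make the classification and evaluation steps named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It trains a multinomial Naive Bayes classifier with add-one smoothing directly on word counts and computes precision, recall and F-measure. The LDA topic-modelling step is omitted here, so this illustrates only the Naive Bayes half of the pipeline. The function names, the toy tweets and the labels "bullying" and "normal" are all hypothetical and chosen for illustration.

```python
import math
from collections import Counter, defaultdict


def train_nb(docs, labels):
    """Fit a multinomial Naive Bayes model on whitespace-tokenised text."""
    class_counts = Counter(labels)      # number of documents per class
    word_counts = defaultdict(Counter)  # per-class word frequencies
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict_nb(model, text):
    """Return the class with the highest log-posterior for the given text."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(class_counts):
        # Log prior of the class.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # words never seen in training are ignored
            # Add-one (Laplace) smoothed log likelihood of the word.
            count = word_counts[label][token]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall and F-measure for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy data, not taken from the paper's Twitter corpus.
train_docs = [
    "you are stupid and ugly",
    "nobody likes you loser",
    "great game today",
    "love this sunny day",
]
train_labels = ["bullying", "bullying", "normal", "normal"]
model = train_nb(train_docs, train_labels)
```

Calling `predict_nb(model, "you are a loser")` on this toy model returns a label, and `precision_recall_f1` can then be applied to lists of true and predicted labels for a held-out set. In a pipeline closer to the one the abstract describes, the word counts would presumably be replaced or supplemented by LDA topic proportions before the Naive Bayes step.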