• P-ISSN 0974-6846 E-ISSN 0974-5645

Indian Journal of Science and Technology



Year: 2022, Volume: 15, Issue: 17, Pages: 790-797

Original Article

Performance Evaluation of Sentiment Analysis on Balanced and Imbalanced Dataset Using Ensemble Approach

Received Date: 16 December 2021, Accepted Date: 24 March 2022, Published Date: 10 May 2022


Background: Class imbalance is a persistent difficulty in sentiment analysis. In imbalanced classification, the few minority-class instances provide too little information, so learning directly from an imbalanced dataset can produce unsatisfactory results. This work aims to address the problem of class imbalance.

Methods: The study first uses a novel Synthetic Minority Oversampling Technique (SMOTE) to balance the dataset and then proposes an ensemble model, named Ensemble Bagging Support Vector Machine (EBSVM), for opinion mining. To measure the performance of this approach, numerous analyses are conducted on both the imbalanced and the balanced dataset. The work then compares the effectiveness of the proposed model with three base classifiers: Naive Bayes (NB), Decision Tree (DT), and Support Vector Machine (SVM). Customer reviews of restaurants were chosen as the dataset. Accuracy, precision, recall, and F-measure are used as evaluation metrics.

Findings: According to the results, the proposed EBSVM model outperforms all of the individual classifiers on both the imbalanced and the SMOTE-balanced dataset. The balanced EBSVM classifier is more accurate than the imbalanced EBSVM classifier. Precision, recall, and F-measure for the minority class are higher for the balanced classifiers than for the imbalanced ones.

Novelty: This paper evaluates the performance of opinion mining classifiers on imbalanced and balanced datasets. The work examines not only general opinions but also specific aspects such as food, service, ambiance, quality, and price. Compared with existing classification algorithms in the literature, the proposed model was found to perform better.
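The balancing step and the minority-class metrics named in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the function names `smote_oversample` and `precision_recall_f1`, the toy points, the labels, and the choice `k=3` are all assumptions made for illustration, and the EBSVM ensemble itself is not reproduced here. The first function shows the core SMOTE idea (interpolating between a minority sample and one of its nearest minority neighbours); the second computes precision, recall, and F-measure for a chosen class.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance; sufficient for ranking neighbours."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points, each on the segment between a minority
    sample and one of its k nearest minority neighbours.
    Assumes `minority` holds at least two points (hypothetical helper)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: squared_distance(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F-measure treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data (assumed): 3 minority points to be grown to match 8 majority points.
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1)]
majority_count = 8
balanced_minority = minority + smote_oversample(
    minority, majority_count - len(minority)
)
print(len(balanced_minority), "minority points after oversampling")

# Toy predictions (assumed), scored for the minority class "neg".
y_true = ["neg", "neg", "pos", "pos", "pos", "pos"]
y_pred = ["neg", "pos", "pos", "pos", "pos", "neg"]
print(precision_recall_f1(y_true, y_pred, positive="neg"))
```

Reporting the metrics per class, rather than accuracy alone, is what makes the comparison between imbalanced and balanced classifiers meaningful: a classifier can reach high accuracy on an imbalanced dataset while rarely predicting the minority class correctly.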

Keywords: Bagging; Accuracy; Ensemble; Precision; Recall; F-measure




© 2022 George & Srividhya. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Published By Indian Society for Education and Environment (iSee)

