Indian Journal of Science and Technology
DOI: 10.17485/ijst/2016/v9i31/95634
Year: 2016, Volume: 9, Issue: 31, Pages: 1-7
Review Article
Sai Prasad Potharaju1* and M. Sreedevi2
1 K L University, [email protected]
2 Department of Computer Science and Engineering, [email protected]
*Author for correspondence
Sai Prasad Potharaju
K L University,
Email: [email protected]
Objectives: This article presents a framework to improve the accuracy of rule induction and decision tree models. Analysis: We used a rebalancing algorithm, the Synthetic Minority Over-sampling Technique (SMOTE), to improve the accuracy of several rule induction and decision tree models for predicting kidney disease in patients. For this prediction, data collected from Apollo Hospitals, Tamil Nadu, India, were analysed. Findings: The initial dataset is imbalanced, i.e. most of the instances belong to the same class. When a dataset is imbalanced, traditional models cannot produce accurate results. The proposed framework therefore improves model accuracy by balancing the dataset: SMOTE, a technique that oversamples the minority class, is applied to the existing dataset so that the difference in size between the classes is minimized. Results obtained with various classification algorithms show that accuracy increases when oversampling is used, and the results on the balanced and imbalanced datasets are compared. In particular, this method can attain an average accuracy of 98.73%. Applications: This method can be applied in other areas to improve accuracy when the dataset is imbalanced. For Big Data, SMOTE can also be applied using the Hadoop framework and the MapReduce programming model with a new algorithmic approach.
Keywords: Classification, Data Mining, Health Informatics, Kidney Failure, SMOTE
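The oversampling step named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it is a simplified, numeric-features-only version of the SMOTE idea, in which each synthetic minority instance is placed at a random position on the line segment between a minority instance and one of its k nearest minority neighbours. The function name `smote`, its parameters, and the toy data are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Return n_synthetic points interpolated between minority samples.

    minority: list of equal-length numeric feature lists (minority class only).
    Simplification: the base sample is drawn at random for each new point,
    rather than looping over every minority sample as in the original method.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the other minority samples, nearest to the base first
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies on the segment between base and neighbour
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical toy data: 8 majority rows and 3 minority rows, two features each
majority = [[float(x), float(x % 3)] for x in range(8)]
minority = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

# Generate enough synthetic rows to match the majority class size
new_rows = smote(minority, n_synthetic=len(majority) - len(minority), k=2)
balanced_minority = minority + new_rows
print(len(majority), len(balanced_minority))
```

In this sketch the classifier would then be trained on the majority rows plus `balanced_minority`. Nominal attributes, which the sketch does not handle, would need a different distance and interpolation rule.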