Indian Journal of Science and Technology
Year: 2017, Volume: 10, Issue: 18, Pages: 1-8
A. Sheik Abdullah1*, S. Selvakumar2, P. Karthikeyan1 and M. Venkatesh3
1Department of Information Technology, Thiagarajar College of Engineering, Madurai – 625015, Tamil Nadu, India
2Department of Computer Science and Engineering, G.K.M. College of Engineering and Technology, Chennai – 600063, Tamil Nadu, India
3Department of Medicine, Theni Government Medical College and Hospital, Theni – 625531, Tamil Nadu, India
*Author for correspondence: A. Sheik Abdullah, Department of Information Technology, Thiagarajar College of Engineering, Madurai – 625015, Tamil Nadu, India
Objectives: This research aims to identify the best-performing variant among decision tree algorithms, namely Weighted Decision Trees (WDT), C4.5 and C5.0. Methods: Decision trees have a number of variants, such as ID3, the weight-based decision tree, C4.5 and C5.0. This work analyses the predictive performance of the weight-based decision tree with information gain as the splitting criterion. The algorithm proceeds iteratively, assigning weights to the training instances in order to determine the best among the data attributes. The attribute with the best weight values is then identified by the improvement it yields in accuracy. Results: The experimental results show that, among the decision tree variants, C4.5 provides the highest accuracy, about 71.42%, with an R² value of about 0.265; for real-world data the accuracy is about 48.69%. The effectiveness of the decision tree algorithm can be improved further by combining it with feature selection techniques. Conclusion: The results show that the decision tree algorithm is well suited to medical data problems. Its efficiency can be improved further by applying decision trees together with feature selection to other real-world data problems, such as diabetes and cancer classification. A larger set of real-world data still has to be investigated.
Keywords: C4.5 Decision Tree Algorithm, Data Classification, Heart Disease, Predictive Analysis, Weighted Decision Tree
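The splitting criterion named in the abstract, information gain computed over weighted training instances, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the toy records and the uniform weights are assumptions made purely for illustration, and the iterative re-weighting scheme of the weighted decision tree is not reproduced here.

```python
import math
from collections import defaultdict


def weighted_entropy(labels, weights):
    """Shannon entropy (in bits) of the class labels, where each
    instance contributes its weight instead of a count of one."""
    total = sum(weights)
    if total == 0:
        return 0.0
    mass = defaultdict(float)
    for label, w in zip(labels, weights):
        mass[label] += w
    return -sum((m / total) * math.log2(m / total)
                for m in mass.values() if m > 0)


def weighted_information_gain(values, labels, weights):
    """Entropy of the whole set minus the weight-averaged entropy of
    the subsets produced by splitting on one categorical attribute."""
    total = sum(weights)
    groups = defaultdict(lambda: ([], []))
    for v, y, w in zip(values, labels, weights):
        groups[v][0].append(y)
        groups[v][1].append(w)
    remainder = sum((sum(ws) / total) * weighted_entropy(ys, ws)
                    for ys, ws in groups.values())
    return weighted_entropy(labels, weights) - remainder


def best_attribute(rows, labels, weights):
    """Return the attribute with the largest weighted information gain,
    together with the gain of every attribute."""
    gains = {
        name: weighted_information_gain([r[name] for r in rows], labels, weights)
        for name in rows[0]
    }
    return max(gains, key=gains.get), gains


# Hypothetical toy records (not data from the study).
rows = [
    {"chest_pain": "yes", "sex": "m"},
    {"chest_pain": "yes", "sex": "f"},
    {"chest_pain": "no", "sex": "m"},
    {"chest_pain": "no", "sex": "f"},
]
labels = ["disease", "disease", "healthy", "healthy"]
weights = [1.0, 1.0, 1.0, 1.0]  # assumed uniform starting weights

chosen, gains = best_attribute(rows, labels, weights)
print(chosen, gains)
```

In this toy set the attribute that separates the two classes completely is selected as the split, while an attribute that is independent of the class receives zero gain. Changing the entries of `weights` shifts the gains toward the attributes that best explain the heavily weighted instances.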