Indian Journal of Science and Technology
Year: 2017, Volume: 10, Issue: 20, Pages: 1-4
Priya Mohan1* and Ilango Paramasivam2
1Department of Computer Science, Bharathiar University, Coimbatore – 641046, Tamilnadu, India; [email protected] 2School of Computing Science & Engineering, VIT University, Vellore – 632014, Tamilnadu, India; [email protected]
*Author for correspondence:
Department of Computer Science, Bharathiar University, Coimbatore – 641046, Tamilnadu, India; [email protected]
Objectives: The time complexity of a machine learning algorithm is directly proportional to the dimensionality of the dataset. In this paper, the impact of dataset dimensionality on a machine learning algorithm, the Naïve Bayes Classifier, is evaluated over all feature subsets to analyze whether its performance varies. Methods/Statistical Analysis: The Naïve Bayes Classifier is studied to evaluate how its performance varies in terms of correctly and incorrectly classified instances. The Pima Indian Type II diabetes dataset is used for the experimental study. A confusion matrix is computed for the Naïve Bayes Classifier using 10-fold cross-validation for each run. The study demonstrates the impact of dimensionality on the performance of the Naïve Bayes Classifier. Findings: The Naïve Bayes Classifier classifies each patient record as either diabetic or non-diabetic using the values of the feature set. It is a probabilistic approach to the binary classification of patient records. The dimensionality of the feature set is found to affect the performance of the Naïve Bayes Classifier in terms of classification accuracy and the number of true positives, true negatives, false positives and false negatives. Incorrect classification is dangerous, whereas valid classification helps healthcare systems plan an effective course of treatment that can save the life of the patient. An invalid classification leads to a wrong diagnosis when the treatment plan is formulated and can lead to loss of life. Hence, invalid classification, in particular the false negative rate, is to be viewed very seriously. The study shows that the higher dimensionality of the dataset affects the performance of the Naïve Bayes Classifier.
Application/Improvements: The findings can be used in medical informatics for quality diagnosis and effective treatment planning. A focus on the false positive rate in the classification accuracy of the Naïve Bayes Classifier will notably help healthcare systems diagnose patients accurately and save lives.
Keywords: Classification Accuracy, Dimensionality Reduction, Machine Learning, Naïve-Bayes Classifier
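The evaluation procedure described in the abstract (a Naïve Bayes classifier scored on every feature subset, with a confusion matrix from 10-fold cross-validation) could be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: it assumes Gaussian likelihoods for the features (the abstract does not state which variant was used), it runs on small synthetic stand-in data rather than the Pima dataset, and all function names are hypothetical.

```python
import itertools
import math
import random

def fit_gaussian_nb(X, y):
    """Estimate the class prior and per-feature mean/variance for each class."""
    model = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        stats = []
        for col in zip(*rows):
            mean = sum(col) / len(col)
            # Small constant avoids a zero variance (assumption of this sketch).
            var = sum((v - mean) ** 2 for v in col) / len(col) + 1e-9
            stats.append((mean, var))
        model[label] = (len(rows) / len(X), stats)
    return model

def predict(model, x):
    """Return the class with the highest log-posterior under the naive assumption."""
    def log_post(label):
        prior, stats = model[label]
        return math.log(prior) + sum(
            -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            for v, (mean, var) in zip(x, stats))
    return max(model, key=log_post)

def cross_validated_confusion(X, y, k=10, seed=0):
    """Pool TP, TN, FP and FN over k folds; class 1 is treated as positive."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for fold in range(k):
        test = set(idx[fold::k])  # every k-th shuffled index forms one fold
        train = [i for i in idx if i not in test]
        model = fit_gaussian_nb([X[i] for i in train], [y[i] for i in train])
        for i in test:
            if predict(model, X[i]) == 1:
                counts["TP" if y[i] == 1 else "FP"] += 1
            else:
                counts["FN" if y[i] == 1 else "TN"] += 1
    return counts

def evaluate_all_subsets(X, y, k=10):
    """Run the cross-validated classifier on every non-empty feature subset."""
    n_features = len(X[0])
    results = {}
    for size in range(1, n_features + 1):
        for subset in itertools.combinations(range(n_features), size):
            X_sub = [[row[j] for j in subset] for row in X]
            results[subset] = cross_validated_confusion(X_sub, y, k)
    return results

def make_synthetic(n=200, seed=1):
    """Synthetic stand-in data (NOT the Pima dataset): features 0 and 1
    carry signal, feature 2 is pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(2.0 * label, 1.0),
                  rng.gauss(1.5 * label, 1.0),
                  rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y

X, y = make_synthetic()
results = evaluate_all_subsets(X, y)
for subset, c in results.items():
    accuracy = (c["TP"] + c["TN"]) / len(y)
    false_negative_rate = c["FN"] / (c["FN"] + c["TP"])
    print(subset, c, round(accuracy, 3), round(false_negative_rate, 3))
```

With three synthetic features the loop visits 7 subsets. A dataset with 8 predictor attributes, as the Pima data is commonly distributed, would give 255 non-empty subsets, so the exhaustive search stays cheap at that size. Reporting the false negative rate next to accuracy reflects the abstract's concern about missed diagnoses.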