Indian Journal of Science and Technology
Year: 2016, Volume: 9, Issue: 3, Pages: 1-12
*Author For Correspondence
Rashmi Agrawal Manav Rachna International University, Faridabad [email protected]
Classifying uncertain data has become one of the more difficult tasks in data mining. An uncertain dataset contains tuples that carry multiple, differing values, so predicting the correct output class is a complex process. In this paper, we present two algorithms for classifying uncertain data with the K-Nearest Neighbor (KNN) classifier; they handle the uncertain dataset in two different ways to determine the corresponding class. In both algorithms, we split the database into two portions: a training dataset and a testing dataset. Algorithm 1 uses the properties of the uncertain data to convert it into certain data and then applies the KNN algorithm, which classifies the data using a distance measure. Algorithm 2 converts the uncertain data through a probability distribution function (pdf): it first calculates N split points for each attribute of the training portion of the uncertain data and then calculates the pdf with respect to the selected split points. The same process is applied to the testing portion, after which Algorithm 2 employs the KNN algorithm to classify the converted data. Finally, we compared the proposed algorithms with the UDT (Uncertain Decision Tree) algorithm on four real datasets (iris, ionosphere, breast cancer, and glass) and found that the proposed algorithms outperform UDT in terms of accuracy.
Keywords: Classification, K Nearest Neighbor Algorithm (K-NN), Probability Distribution Function, Uncertain Data, Uncertain Data Classification
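The two conversion strategies summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the data representation, the property used to make the data certain, or how split points are chosen, so everything below is an assumption made for illustration: each uncertain attribute is taken to be a list of sampled values, the Algorithm 1 style conversion uses the sample mean, the split points are evenly spaced over the training range, and the Algorithm 2 style conversion uses the empirical probability that an attribute falls at or below each split point. The function names (`to_certain_mean`, `split_points`, `to_certain_pdf`, `knn_predict`) and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import dist
from statistics import mean

# Assumed representation: an uncertain tuple is a list of attributes,
# and each attribute is a list of sampled values.

def to_certain_mean(row):
    """Algorithm 1 style (assumed): collapse each attribute to its sample mean."""
    return [mean(samples) for samples in row]

def split_points(train_rows, n_splits):
    """Assumed choice: n_splits evenly spaced points per attribute,
    strictly inside the range observed in the training portion."""
    points = []
    for j in range(len(train_rows[0])):
        values = [v for row in train_rows for v in row[j]]
        lo, hi = min(values), max(values)
        step = (hi - lo) / (n_splits + 1)
        points.append([lo + step * (i + 1) for i in range(n_splits)])
    return points

def to_certain_pdf(row, points):
    """Algorithm 2 style (assumed): for every split point, the empirical
    probability that the attribute value is at or below that point."""
    features = []
    for samples, pts in zip(row, points):
        features.extend(sum(v <= p for v in samples) / len(samples) for p in pts)
    return features

def knn_predict(train_X, train_y, query, k=3):
    """Plain KNN: Euclidean distance and majority vote among the k nearest."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], query))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]

# Toy data (made up): two attributes, each observed several times.
train_rows = [
    [[1.0, 1.2, 0.8], [2.0, 2.1]],
    [[1.1, 0.9], [1.9, 2.2]],
    [[5.0, 5.2], [7.0, 6.8]],
    [[4.8, 5.1, 5.3], [7.2, 7.1]],
]
train_y = ["a", "a", "b", "b"]
query = [[1.0, 1.1], [2.0, 2.3]]

# Mean-based conversion followed by KNN.
X_mean = [to_certain_mean(r) for r in train_rows]
pred_mean = knn_predict(X_mean, train_y, to_certain_mean(query))

# Split-point conversion: points come from the training portion only and
# are reused unchanged for the testing tuple.
pts = split_points(train_rows, n_splits=3)
X_pdf = [to_certain_pdf(r, pts) for r in train_rows]
pred_pdf = knn_predict(X_pdf, train_y, to_certain_pdf(query, pts))
```

In both variants the KNN step is identical; only the conversion from uncertain tuples to fixed-length numeric vectors differs, which mirrors the structure the abstract describes.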