Indian Journal of Science and Technology
Year: 2016, Volume: 9, Issue: 48, Pages: 1-7
Pooja Yadav, Anuradha and Poonam Sharma
Department of Computer Science, The North Cap University, Gurgaon – 122017, Haryana, India
Objectives: Owing to advances in storage technology, every moderate to large organization now holds a huge and rapidly growing volume of multifaceted data. Dealing with such data requires efficient analysis techniques such as classification and clustering. Methods/Statistical Analysis: Noise and outliers in high-dimensional data sets can degrade the performance of any data analysis task, and the situation becomes even worse for unsupervised classification (i.e., clustering). In this paper, we propose a new method for incremental density-based clustering of high-dimensional data sets that achieves a reasonable speed-up. The proposed method is fused with a noise removal and outlier labeling technique inspired by the well-known box plot method. Findings: The performance of the fusion is analyzed on five high-dimensional data sets taken from the University of California, Irvine (UCI) repository using four cluster evaluation metrics (F-Measure, Entropy, Purity and Speed-Up). The clustering results confirm the effectiveness of the proposed fusion. Application/Improvements: The proposed technique can be refined by hybridizing it with a metaheuristic technique for stock exchange applications.
Keywords: Box Plot, Density Based Clustering, DBSCAN, Entropy, Incremental Partitioning
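The abstract names the box plot method as the inspiration for its outlier labeling step but does not state the exact rule. As an illustration only, the following minimal sketch uses the standard Tukey box plot fences (points outside Q1 - 1.5·IQR or Q3 + 1.5·IQR are labeled outliers). The function names, the multiplier k = 1.5, the linearly interpolated quartiles and the single-feature sample are all assumptions made for this sketch, not details taken from the paper.

```python
from typing import List, Tuple


def quantile(sorted_vals: List[float], q: float) -> float:
    """Linearly interpolated quantile of an already sorted list (0 <= q <= 1)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def box_plot_fences(values: List[float], k: float = 1.5) -> Tuple[float, float]:
    """Return the (lower, upper) box plot fences; k = 1.5 is an assumed default."""
    s = sorted(values)
    q1, q3 = quantile(s, 0.25), quantile(s, 0.75)
    iqr = q3 - q1  # interquartile range
    return q1 - k * iqr, q3 + k * iqr


def label_outliers(values: List[float], k: float = 1.5) -> List[bool]:
    """Flag each value that falls outside the box plot fences."""
    low, high = box_plot_fences(values, k)
    return [v < low or v > high for v in values]


# Hypothetical one-dimensional feature with one extreme value.
sample = [2.0, 3.0, 3.5, 4.0, 4.5, 5.0, 40.0]
flags = label_outliers(sample)
cleaned = [v for v, is_out in zip(sample, flags) if not is_out]
```

For a high-dimensional data set, one plausible reading is to apply such fences feature by feature before the density-based clustering step; whether the paper does so is not stated in the abstract.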