P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology

Article

Year: 2016, Volume: 9, Issue: 26, Pages: 1-7

Original Article

Detection of Projected Outliers from the Higher Dimensional data sets using Extended Kalman Filter and Fuzzy K-Means

Abstract

Objectives: The curse of dimensionality and attribute relevance are major concerns when dealing with higher-dimensional data sets or Big Data, especially when detecting projected outliers. The objective of this paper is to construct a robust and scalable model that highlights higher-dimensional outliers effectively and efficiently. Methods/Analysis: To detect projected outliers, a hybrid algorithm, EKFFK-Means, is constructed from two methodologies: the Extended Kalman Filter (EKF) and Fuzzy K-Means. The EKF linearizes the higher-dimensional data by estimating the current mean and covariance and enhancing the Kalman gain; Fuzzy K-Means then confirms the outlying property of each data instance and categorizes the instances using the membership label. Findings: The EKFFK-Means model creates 30 clusters from the complete data set to detect the projected outliers, and parameters such as accuracy, cluster validity, true positive rate, false positive rate, robustness, and cluster quality are calculated. Improvements: The algorithm is compared with HPStream and CLUStream and is reported to perform better across these parameters.
Keywords: Clustering, Projected Outliers, Robustness, Scalability, Unsupervised 
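
The abstract does not give the details of the Fuzzy K-Means stage, so the following is only a minimal sketch of textbook fuzzy K-means (fuzzy c-means), not the authors' EKFFK-Means. The EKF stage is omitted entirely. The function names, the fuzzifier m = 2, the fixed iteration count, and the rule "flag a point whose largest membership falls below a threshold" are all assumptions made for illustration.

```python
import math

def _memberships(points, centers, m):
    """u[i][j] is proportional to dist(point_i, center_j) ** (-2 / (m - 1))."""
    rows = []
    for p in points:
        # Clamp tiny distances so a point sitting on a center does not divide by zero.
        dists = [max(math.dist(p, c), 1e-12) for c in centers]
        weights = [d ** (-2.0 / (m - 1.0)) for d in dists]
        total = sum(weights)
        rows.append([w / total for w in weights])
    return rows

def fuzzy_kmeans(points, init_centers, m=2.0, iters=50):
    """Textbook fuzzy K-means; returns (centers, memberships)."""
    centers = [list(c) for c in init_centers]
    dim = len(points[0])
    for _ in range(iters):
        u = _memberships(points, centers, m)
        # Each center becomes the mean of all points weighted by u ** m.
        for j in range(len(centers)):
            w = [row[j] ** m for row in u]
            denom = sum(w)
            centers[j] = [
                sum(wi * p[d] for wi, p in zip(w, points)) / denom
                for d in range(dim)
            ]
    return centers, _memberships(points, centers, m)

def flag_low_membership(memberships, threshold):
    """Assumed outlier rule: no cluster claims the point strongly."""
    return [i for i, row in enumerate(memberships) if max(row) < threshold]
```

Because fuzzy memberships sum to one for every point, an instance far from all clusters ends up with roughly equal, low memberships; that is the property the assumed threshold rule relies on. It also flags points lying between clusters, so it should not be read as the paper's outlier criterion.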
