• P-ISSN 0974-6846 E-ISSN 0974-5645

Indian Journal of Science and Technology


Indian Journal of Science and Technology

Year: 2016, Volume: 9, Issue: 47, Pages: 1-13

Original Article

CATs-Clustered k-Anonymization of Time Series Data with Minimal Information Loss and Optimal Re-identification Risk


Background/Objectives: Time series is a significant type of data, widely used in diverse application such as financial, medical, and weather analyses, which in-turn contain personal privacy to a great extent. Methods/Statistical Analysis: The perquisite to protect privacy of time series data is to bolster the data holder to get involved in the above applications without any privacy threats. The k-anonymization approach of time series data has picked up consideration over late years, a key requirement of such an approach is to guarantee anonymization of time series data while minimizing the information loss caused from that approach. Findings: In this article, we implemented a novel methodology called CATs (Clustered k-Anonymization of Time Series Data) that applies the idea of clustering on time series data and ensure anonymization by gaining minimized information loss within venerable utility. The fundamental perception here is that the time series data tuples that are alike, ought to be a part of one cluster, and de-identification of these tuples is furnished. We thus formulate and proposed this approach as CATs, implemented through mishmash of WEKA and ARX anonymization tool. We have executed the solution on two benchmark time series data set available in UCR, Our experimental result strives that CATs confirms to have minimal information loss ranging from 18% to 24% reduction rate when compared with existing TSA (Time Series Anonymization) approaches. Applications/Improvements: As result of our experimentation, we express that our approach can play a remarkable role in the field of financial management, Online Medical process monitoring and management etc.

Keywords: Clustering, Information Loss, k-Anonymization, Privacy Preserving Data Mining, Re-Identification Risks, Time Series Data Mining


Subscribe now for latest articles and news.