Indian Journal of Science and Technology
Year: 2015, Volume: 8, Issue: 22, Pages: 1-9
Amalraj Irudayasamy* and L. Arockiam
Background/Objectives: Anonymizing data sets through generalization to satisfy privacy requirements such as k-anonymity is a widely used privacy-preserving technique. A parallel bottom-up generalization approach is introduced to anonymize large data sets using the MapReduce framework on a public cloud. A group of MapReduce jobs is formulated to perform the generalization in a highly scalable manner. Methods/Statistical Analysis: MapReduce, a widely adopted parallel data processing framework, is used to address privacy preservation with minimal information loss by applying the Bottom-Up Generalization (BUG) approach to large-scale data anonymization. To make full use of the parallelism of MapReduce on the cloud, the whole process is split into two phases. First, the original data set is partitioned into a collection of smaller data sets, which are anonymized in parallel to produce intermediate results. Second, the intermediate results are combined and further anonymized to attain consistent k-anonymous data sets. MapReduce is used to carry out the computation in both phases. Findings: Experimental evaluation shows that the approach achieves high privacy preservation with minimal information loss in less execution time than existing approaches. The results demonstrate the insufficiency of state-of-the-art sub-tree anonymization approaches when handling large data sets. Given the observed trends in execution time and information loss, it is necessary and reasonable to choose MRBUG to perform parallel generalized data anonymization of large data sets, depending on the value of k. Applications/Improvements: Optimized, heuristic, and balanced scheduling approaches are expected to be developed towards overall scalable privacy preservation. The structure of bottom-up generalization is believed to be amenable to several extensions that would make it more practical, including incorporating different metrics, handling data suppression, and partial generalization, in which not all child values need to be generalized together. It is also possible to generalize numeric attributes without a pre-determined hierarchy; this will be taken up as future work.
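The two-phase scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MRBUG implementation: the taxonomy, the helper names (`lift`, `lift_to_present`, `generalize`, `anonymize`), the single quasi-identifier attribute, and the "generalize the rarest value first" heuristic are all assumptions made for illustration, and a thread pool stands in for MapReduce jobs.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical taxonomy tree: child value -> parent value ("Any" is the root).
TAXONOMY = {
    "Engineer": "Professional", "Lawyer": "Professional",
    "Dancer": "Artist", "Writer": "Artist",
    "Professional": "Any", "Artist": "Any",
}

def is_k_anonymous(values, k):
    """True if every distinct value occurs at least k times."""
    return all(count >= k for count in Counter(values).values())

def lift(value, target):
    """Return target if it is an ancestor of value, otherwise value unchanged."""
    node = value
    while node in TAXONOMY:
        node = TAXONOMY[node]
        if node == target:
            return target
    return value

def lift_to_present(value, present):
    """Replace value by its highest ancestor that already occurs in `present`."""
    best, node = value, value
    while node in TAXONOMY:
        node = TAXONOMY[node]
        if node in present:
            best = node
    return best

def generalize(values, k):
    """Bottom-up generalization: climb the taxonomy until k-anonymity holds.

    Assumed heuristic: generalize the rarest violating value first. The paper's
    actual selection metric is not given in the abstract.
    """
    values = list(values)
    while True:
        counts = Counter(values)
        rare = [v for v in counts if counts[v] < k and v in TAXONOMY]
        if not rare:
            return values
        target = TAXONOMY[min(rare, key=lambda v: (counts[v], v))]
        # Replace every descendant of the chosen parent, keeping the cut consistent.
        values = [lift(v, target) for v in values]

def anonymize(values, k, n_parts=2):
    """Phase 1: anonymize partitions in parallel. Phase 2: merge and re-anonymize."""
    size = max(1, -(-len(values) // n_parts))  # ceiling division
    parts = [values[i:i + size] for i in range(0, len(values), size)]
    with ThreadPoolExecutor() as pool:  # stand-in for map tasks
        intermediate = list(pool.map(lambda part: generalize(part, k), parts))
    merged = [v for part in intermediate for v in part]
    # Partitions may have stopped at different levels; align them first.
    present = set(merged)
    merged = [lift_to_present(v, present) for v in merged]
    return generalize(merged, k)

if __name__ == "__main__":
    records = ["Engineer", "Lawyer", "Dancer", "Dancer",
               "Engineer", "Engineer", "Writer", "Writer"]
    print(anonymize(records, k=2))
```

In this toy run, the first partition generalizes `Engineer` and `Lawyer` to `Professional`, while the second partition is already 2-anonymous. The merge step then lifts the second partition's `Engineer` values to `Professional` so that the combined result uses a single consistent level for that subtree.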
Keywords: Bottom-Up Generalization, Cloud, Data Anonymization, Map Reduce, Privacy Preservation