P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2021, Volume: 14, Issue: 6, Pages: 519-526

Original Article

Novel algorithm for efficient privacy preservation in data analytics

Received Date: 29 September 2020, Accepted Date: 12 January 2021, Published Date: 19 February 2021

Abstract

Objective: To address modern privacy threats in data analytics by designing an efficient privacy-preserving data analytics technique.

Methods: The proposed method is non-anonymizing: it synthesizes quasi-identifiers and then applies differential privacy. It was applied to three data sets, namely the Adult, Statlog, and Indian Liver Patient data sets, all of which are freely available in the UCI repository.

Findings: The study presents “Synthesize Quasi Identifiers and apply Differential Privacy” (SQIDP), which is shown to be a more efficient and scalable algorithm. Unlike anonymity-based algorithms, SQIDP is not prone to similarity attacks, background knowledge attacks, attribute disclosure, or inference attacks. Anonymization, cryptographic, SWARM, and randomization methods reduce data utility, whereas SQIDP offers 100% data utility and is therefore more efficient than these techniques. SQIDP was applied to three data sets with 270, 583, and 48842 records, and its execution time remained the same for all three. Because it does not anonymize the data, SQIDP retains 100% data utility while abiding by the recommendations of privacy legislation such as the GDPR (General Data Protection Regulation) of the European Union and the PDP (Personal Data Protection) bill of India.
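The abstract names two building blocks, synthesizing quasi-identifiers and applying differential privacy, but does not spell out how SQIDP implements either. The Python sketch below is therefore only an illustration of the two general ideas, not the authors' algorithm. It assumes that quasi-identifier columns are resampled from their own observed values and that differential privacy is applied to a counting query through the standard Laplace mechanism. The function names, the record layout, and the `epsilon` parameter are all hypothetical.

```python
import random


def synthesize_quasi_identifiers(records, quasi_ids, rng):
    """Return a copy of `records` in which every quasi-identifier column is
    resampled from that column's observed values (an assumed strategy).
    Other columns are left unchanged and the input is not mutated."""
    synthetic = [dict(row) for row in records]
    for col in quasi_ids:
        pool = [row[col] for row in records]
        for row in synthetic:
            row[col] = rng.choice(pool)
    return synthetic


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two independent
    exponential draws, each with rate 1/scale."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Laplace mechanism for a counting query. Adding or removing one record
    changes the count by at most 1, so the noise scale is 1/epsilon."""
    true_count = sum(1 for row in records if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

A caller would pass a list of dictionaries, the names of the quasi-identifier columns (for example age and postal code), and a seeded `random.Random` instance, then run `private_count` on the result. Resampling each column independently keeps its marginal distribution but breaks correlations between columns, so this sketch makes no claim to reproduce the utility figures reported for SQIDP.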

Keywords: Data privacy; privacy regulations; privacy preservation; synthetic data; differential privacy

Copyright

© 2021 Ram Mohan Rao et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published by Indian Society for Education and Environment (iSee).
