• P-ISSN 0974-6846 E-ISSN 0974-5645

Indian Journal of Science and Technology


Year: 2015, Volume: 8, Issue: 26, Pages: 1-7

Original Article

Framework of Data Deduplication: A Survey


This survey presents the concept and framework of the data deduplication process, its applications, the methods and technologies involved at each level of its implementation, and its limitations. Different chunking algorithms, such as fixed-size, variable-size, and content-aware chunking, are used to determine the chunk boundaries for deduplication. To avoid a single point of failure in a distributed system, a cluster model with parallel processing can be used. Compared with a backup system that does not use deduplication, up to 75% of storage space can be saved when deduplication is applied. Digital data growth occurs across all cloud deployment models and demands more storage capacity, cost, manpower, and time for operations such as backup, replication, and disaster recovery, as well as more bandwidth when transmitting data across the network. If data is handled effectively, for example by removing redundant data before it is written to the storage device, this overhead can be avoided and system performance improved; data deduplication achieves these results. Variable chunking, including content-aware processing, yields better throughput in a deduplication system than fixed-size or whole-file chunking. Inline processing avoids the need for extra staging space and increases system performance. A cluster model eliminates the single point of failure in a distributed deduplication system. Overall, deduplication saves storage space, avoids the overhead of handling unnecessary data, and reduces resource utilization at minimal cost.
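The core mechanism the abstract describes can be illustrated with a minimal sketch (not from the paper itself): data is split into chunks, each chunk is fingerprinted with a cryptographic hash, and only chunks with previously unseen fingerprints are stored. The fixed-size chunking, the 8-byte chunk size, and the function names below are illustrative assumptions chosen for brevity.

```python
import hashlib

def fixed_chunks(data: bytes, size: int = 8):
    # Fixed-size chunking: split the stream into equal-length chunks
    # (the last chunk may be shorter). Chunk size is an illustrative choice.
    return [data[i:i + size] for i in range(0, len(data), size)]

def deduplicate(chunks):
    # Store each unique chunk once, keyed by its SHA-256 fingerprint.
    # The "recipe" records the ordered fingerprints needed to rebuild
    # the original stream from the chunk store.
    store, recipe = {}, []
    for chunk in chunks:
        fp = hashlib.sha256(chunk).hexdigest()
        store.setdefault(fp, chunk)  # duplicate chunks are not stored again
        recipe.append(fp)
    return store, recipe

def reconstruct(store, recipe):
    # Rebuild the original data by replaying the recipe against the store.
    return b"".join(store[fp] for fp in recipe)

# A stream with one 8-byte block repeated four times plus one unique block:
data = b"ABCDEFGH" * 4 + b"12345678"
store, recipe = deduplicate(fixed_chunks(data))
# 5 logical chunks, but only 2 unique chunks are physically stored.
```

In this toy example three of the five chunks are duplicates and are never written, which is the source of the storage savings the abstract cites; content-aware (variable) chunking improves on this by shifting chunk boundaries with the data, so an insertion near the start of a file does not change the fingerprints of every later chunk.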
Keywords: Chunking, Cluster Deduplication, Data Deduplication, Duplicate Detection, Fingerprint Calculation

