Various Chunking and Deduplication Techniques in Big Data
Naresh Kumar¹, Ishu Devi²
¹ CSE Department, UIET-Kurukshetra University, Kurukshetra, India.
² CSE Department, UIET-Kurukshetra University, Kurukshetra, India.
Correspondence should be addressed to: ishupunia81@gmail.com.
Section: Review Paper, Product Type: Isroset-Journal
Vol.5, Issue.3, pp.129-131, Jun-2017
Published online on Jun 30, 2017
Copyright © Naresh Kumar, Ishu Devi. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
How to Cite this Paper
IEEE Style Citation: Naresh Kumar, Ishu Devi, “Various Chunking and Deduplication Techniques in Big Data,” International Journal of Scientific Research in Computer Science and Engineering, Vol.5, Issue.3, pp.129-131, 2017.
MLA Style Citation: Naresh Kumar, Ishu Devi. "Various Chunking and Deduplication Techniques in Big Data." International Journal of Scientific Research in Computer Science and Engineering 5.3 (2017): 129-131.
APA Style Citation: Naresh Kumar, Ishu Devi (2017). Various Chunking and Deduplication Techniques in Big Data. International Journal of Scientific Research in Computer Science and Engineering, 5(3), 129-131.
BibTeX Style Citation:
@article{Kumar_2017,
author = {Naresh Kumar and Ishu Devi},
title = {Various Chunking and Deduplication Techniques in Big Data},
journal = {International Journal of Scientific Research in Computer Science and Engineering},
issue_date = {6 2017},
volume = {5},
number = {3},
month = {6},
year = {2017},
issn = {2347-2693},
pages = {129--131},
url = {https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=404},
publisher = {IJCSE, Indore, INDIA},
}
RIS Style Citation:
TY  - JOUR
UR  - https://www.isroset.org/journal/IJSRCSE/full_paper_view.php?paper_id=404
TI  - Various Chunking and Deduplication Techniques in Big Data
T2  - International Journal of Scientific Research in Computer Science and Engineering
AU  - Naresh Kumar
AU  - Ishu Devi
PY  - 2017
DA  - 2017/06/30
PB  - IJCSE, Indore, INDIA
SP  - 129
EP  - 131
IS  - 3
VL  - 5
SN  - 2347-2693
ER  - 
Abstract:
Today, very large volumes of data are generated, and much of that data is duplicated. Data at this scale is called big data. Chunking and deduplication mechanisms are used to handle big data and reduce the duplication in it. A deduplication mechanism removes duplicate data by using chunking and hash functions. This paper discusses different chunking and deduplication techniques and presents a comparative analysis of their pros and cons.
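As an illustration of the general idea of removing duplicates with chunking and hash functions, the following is a minimal sketch, not the method of any technique surveyed in the paper. It assumes fixed-size chunking and SHA-256 as the hash function; the function names (`fixed_size_chunks`, `deduplicate`, `restore`) and the 8-byte chunk size are hypothetical choices made only for this example.

```python
import hashlib


def fixed_size_chunks(data: bytes, chunk_size: int = 8):
    """Yield consecutive chunks of chunk_size bytes (the last may be shorter)."""
    for offset in range(0, len(data), chunk_size):
        yield data[offset:offset + chunk_size]


def deduplicate(data: bytes, chunk_size: int = 8):
    """Return (store, recipe).

    store  maps a SHA-256 digest to the single stored copy of that chunk.
    recipe is the ordered list of digests needed to rebuild the input.
    """
    store = {}
    recipe = []
    for chunk in fixed_size_chunks(data, chunk_size):
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # keep only the first copy of each chunk
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data by looking up each digest in order."""
    return b"".join(store[digest] for digest in recipe)


# Illustrative input: the block "ABCDEFGH" occurs four times, "12345678" once.
data = b"ABCDEFGH" * 3 + b"12345678" + b"ABCDEFGH"
store, recipe = deduplicate(data)
```

Here the input splits into five 8-byte chunks, but only two distinct chunks are stored, and `restore(store, recipe)` returns the original bytes. Fixed-size chunking is the simplest choice for a sketch; inserting a single byte near the start of the input would shift every later chunk boundary, which is the kind of weakness content-defined chunking (CDC) is meant to address.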
Keywords / Index Terms:
Big Data, Chunking, Deduplication, FBC (Frequency Based Chunking) and CDC (Content Defined Chunking)