• P-ISSN 0974-6846 E-ISSN 0974-5645

Indian Journal of Science and Technology


Indian Journal of Science and Technology

Year: 2016, Volume: 9, Issue: 37, Pages: 1-5

Original Article

Big Data Mining Techniques


Objectives: The objective of this research work is to discuss the various techniques which can be used for mining of big data viz. sampling, incremental learning, and distributed learning. Methods: For this study, literature survey was done to ­identify the various techniques employed by different authors to handle large (and streaming) data sets. For each ­technique, one or more algorithm was chosen and applied on large data sets. The platform for each technique was standardized (R libraries were used for each algorithm). The algorithms were compared on accuracy and time-consumed. Findings: The findings of this research work which conform to the existing literature is that the distributed learning is the best approach in terms of accuracy and time-complexity, for large data sets. However, if the data sets are streaming data sets and we want to perform real-time analysis then sampling or incremental approach are better than distributed approach. Incremental approach provides better accuracy, whereas sampling reduces time-complexity. Novelty: This study is important in the sense that it brings all the three techniques together on a single platform, which hasn’t been done earlier.
Keywords: Big Data, Data Mining, Distributed Learning, Incremental Learning, Sampling


Subscribe now for latest articles and news.