Diagnosing Diabetic Dataset using Hadoop and K-means Clustering Techniques

K. Sharmila and S. A. Vetha Manickam

doi:10.17485/ijst/2016/v9i40/101618

Indian Journal of Science and Technology

Year: 2016, Volume: 9, Issue: 40, Pages: 1-5

Original Article

K. Sharmila^* and S. A. Vetha Manickam

Department of Computer Science, Vels University, Chennai - 600117, Tamil Nadu, India
*Author for correspondence

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Objectives: This article shows how the enormous volume of data in healthcare systems can be analyzed using a clustering technique. Extracting useful information from such a large volume of data is highly complex, costly, and time-consuming, and in this area data mining can play a key role. In particular, standard data mining algorithms for the analysis of very large data volumes can be parallelized for faster processing. Methods/Statistical Analysis: This paper focuses on how a clustering algorithm, namely K-means, can be used on a parallel processing platform, namely an Apache Hadoop cluster (MapReduce paradigm), in order to analyze large data faster. Findings: As a starting point, we carry out experiments to evaluate the effectiveness of the parallel processing platform in terms of performance. Applications/Improvements: The final results indicate that Apache Hadoop with K-means clustering is a promising approach for scalable performance in predicting and diagnosing diabetic diseases from large volumes of data. The proposed work gives an insight into big data prediction on a diabetic dataset through Hadoop. In future, this approach should be extended to the cloud so as to connect various districts across Tamil Nadu for predicting diabetes-related diseases.
Keywords: Apache Hadoop, K-means, MapReduce
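
The abstract describes K-means expressed in the MapReduce paradigm but does not show the implementation. The following is a minimal single-process sketch in Python of how one K-means iteration might decompose into map, shuffle, and reduce steps. It does not use Hadoop itself, and the function names (`map_point`, `reduce_cluster`, `kmeans_iteration`, `kmeans`) and the toy two-dimensional points are hypothetical illustrations, not taken from the paper or its dataset.

```python
from collections import defaultdict
from math import dist


def map_point(point, centroids):
    """Map step (illustrative): emit (index of nearest centroid, point)."""
    nearest = min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))
    return nearest, point


def reduce_cluster(points):
    """Reduce step (illustrative): average all points that share one key."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans_iteration(points, centroids):
    """One pass: map every point, group by key (shuffle), reduce per group."""
    groups = defaultdict(list)
    for point in points:
        key, value = map_point(point, centroids)
        groups[key].append(value)
    # A centroid that attracted no points keeps its previous position.
    return [reduce_cluster(groups[i]) if groups[i] else centroids[i]
            for i in range(len(centroids))]


def kmeans(points, centroids, max_iter=20):
    """Repeat passes until the centroids stop moving or max_iter is reached."""
    for _ in range(max_iter):
        updated = kmeans_iteration(points, centroids)
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Made-up toy points, purely for illustration (not patient data).
toy_points = [(1.0, 1.0), (1.0, 3.0), (9.0, 9.0), (9.0, 11.0)]
final_centroids = kmeans(toy_points, [(0.0, 0.0), (10.0, 10.0)])
```

In an actual Hadoop job, the map calls would run in parallel over input splits and the framework would perform the grouping by key, so each iteration would typically be submitted as a separate job with the current centroids shared across mappers.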