P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology



Year: 2016, Volume: 9, Issue: 48, Pages: 1-9

Original Article

Feature Selection for Microarray Data using WGCNA Based Fuzzy Forest in Map Reduce Paradigm


Objectives: Feature selection is one of the principal classes of analysis algorithms applied to microarray data and other high-dimensional data. Its goal is to extract a small set of informative features that best explains the experimental variation. This work investigates the feature selection problem for microarray data with small sample sizes and varying correlation among features. Most existing algorithms require expensive computation, particularly when thousands of genes are involved. Gene selection methods typically search for an optimal or near-optimal subset of genes with respect to a given analysis. Methods: The main objective of this work is to select the most informative features from microarray data effectively while keeping the computational cost reasonable. This is achieved by proposing a Fuzzy Forest based on Weighted Gene Correlation Network Analysis (WGCNA), which exploits the interactions between features (genes). Representative features (genes) are selected from each enriched feature partition to form the reduced gene space. Findings: By defining an appropriate regression context, the proposed method can be readily implemented in the MapReduce paradigm, which considerably reduces the computational load and also leads to lower out-of-bag (OOB) prediction error rates, together with variable importance measures, compared with existing approaches. Thus, it is necessary to understand the performance of random forests on microarray data and their potential use for gene selection. Applications: Major applications of the WGCNA-based Fuzzy Forest include genomic data analysis (including microarray data), neuroscience, bioinformatics, DNA methylation data analysis (16S rRNA gene sequencing), cancer, yeast genetics analysis, and analysis of brain imaging data (functional MRI data analysis).
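
The module-wise selection outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it (`build_modules`, `map_screen`, `reduce_select`, the toy genes `g1` to `g4`) is hypothetical. Three simplifications are assumed: connected components over the soft-thresholded adjacency |cor|^beta stand in for WGCNA's hierarchical clustering, the absolute correlation with the response stands in for random forest variable importance, and plain Python functions mimic the map and reduce steps rather than a Hadoop job.

```python
from collections import defaultdict
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def build_modules(features, beta=6, cutoff=0.5):
    """Group genes whose soft-thresholded adjacency |cor|**beta exceeds cutoff.

    Assumption: connected components (via union-find) stand in for the
    hierarchical clustering that WGCNA would normally perform.
    """
    names = list(features)
    parent = {g: g for g in names}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if abs(pearson(features[a], features[b])) ** beta > cutoff:
                parent[find(a)] = find(b)

    modules = defaultdict(list)
    for g in names:
        modules[find(g)].append(g)
    return list(modules.values())


def map_screen(module, features, response, keep=1):
    """Map step: score one module's genes and keep the top `keep`.

    Assumption: |cor(gene, response)| is a cheap stand-in for random
    forest variable importance.
    """
    scored = [(abs(pearson(features[g], response)), g) for g in module]
    scored.sort(reverse=True)
    return scored[:keep]


def reduce_select(screened):
    """Reduce step: merge per-module survivors into one ranked gene list."""
    merged = [item for part in screened for item in part]
    merged.sort(reverse=True)
    return [g for _, g in merged]


# Toy expression data: g1/g2 are co-expressed, as are g3/g4.
features = {
    "g1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "g2": [1.1, 2.1, 2.9, 4.2, 5.0],
    "g3": [5.0, 1.0, 4.0, 2.0, 3.0],
    "g4": [5.1, 0.9, 4.2, 2.1, 2.9],
}
response = [1.0, 2.0, 3.0, 4.0, 5.0]

modules = build_modules(features)
selected = reduce_select([map_screen(m, features, response) for m in modules])
print(modules, selected)
```

Because each module is screened independently, the calls to `map_screen` could in principle run on separate workers, which is the property that makes this kind of selection a natural fit for MapReduce.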

Keywords: Feature Selection, Fuzzy Forest, MapReduce, Microarray Data, Module Eigengene, Random Forest, RHadoop, WGCNA

