Indian Journal of Science and Technology
Year: 2016, Volume: 9, Issue: 8, Pages: 1-11
Porkodi Rajendran* and Deepika Thangavel
Department of Computer Science, School of Computer Science and Engineering, Bharathiar University, Coimbatore – 641 046, Tamil Nadu, India; [email protected], [email protected]
*Author for Correspondence
Porkodi Rajendran Department of Computer Science, School of Computer Science and Engineering, Bharathiar University, Coimbatore – 641 046, Tamil Nadu, India; [email protected]
Objective: This paper performs gene enrichment analysis on genes from an asthma microarray dataset that have been grouped using the k-means clustering algorithm. Enrichment analysis assigns biological meaning to a group of genes; here it is carried out using Gene Ontology (GO) terms.
Method: The work has two parts: (1) clustering the gene expression profiles with the k-means clustering algorithm, and (2) conducting gene enrichment analysis with GO to identify the GO terms enriched in each cluster with respect to Molecular Functions, Biological Processes and Cellular Components. GO thereby provides external validation of the clusters by indicating whether the genes in a cluster share specific functionalities. The most significant (enriched) GO terms are selected by P-value and visualized as a graph for the genes in each cluster.
Findings: The asthma microarray dataset contains 41,000 genes, of which 9,425 are retained for clustering after preprocessing. The preprocessing steps help improve the accuracy of the results. The experiments produce three cluster sets. The first cluster set has 4 clusters containing 1,720, 2,636, 2,458 and 2,611 genes, respectively; the gene counts for the other two cluster sets are identified in the same way. Gene enrichment analysis is conducted for each cluster in the cluster set based on the most significant molecular function GO terms enriched in that cluster. The most significant GO terms are identified by P-value, and the associations among them are visualized for each cluster.
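The two-step workflow described above (k-means clustering followed by P-value-based GO term enrichment) can be illustrated with a minimal sketch. The abstract does not state which statistical test yields the P-values; the one-sided hypergeometric test used here is a common choice for GO enrichment and is an assumption, not necessarily the authors' method. The function names, gene identifiers, expression values and GO term membership are all hypothetical toy data.

```python
import math
import random


def kmeans(points, k, iterations=50, seed=0):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    points = [tuple(p) for p in points]
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # random initial centroids (assumed scheme)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels


def enrichment_p_value(cluster_genes, term_genes, background_genes):
    """P(overlap >= observed) under a hypergeometric null (assumed test)."""
    cluster = set(cluster_genes) & set(background_genes)
    term = set(term_genes) & set(background_genes)
    N, K, n = len(background_genes), len(term), len(cluster)
    x = len(cluster & term)  # observed overlap
    tail = sum(math.comb(K, i) * math.comb(N - K, n - i)
               for i in range(x, min(K, n) + 1))
    return tail / math.comb(N, n)


# Hypothetical toy data: six genes with three expression measurements each.
profiles = {
    "g1": (1.0, 1.1, 0.9), "g2": (1.2, 0.9, 1.0), "g3": (0.9, 1.0, 1.1),
    "g4": (5.0, 5.2, 4.9), "g5": (5.1, 4.8, 5.0), "g6": (4.9, 5.0, 5.1),
}
genes = list(profiles)
labels = kmeans([profiles[g] for g in genes], k=2)

# Hypothetical GO term annotated to the three highly expressed genes.
term_genes = {"g4", "g5", "g6"}
for c in sorted(set(labels)):
    cluster = {g for g, lab in zip(genes, labels) if lab == c}
    p = enrichment_p_value(cluster, term_genes, set(genes))
    print(f"cluster {c}: {sorted(cluster)} p = {p:.3f}")
```

In practice the P-value would be computed for every GO term in every cluster, and the terms with the smallest values would be reported as the most enriched; a correction for multiple testing is usually applied at that stage, although the abstract does not say whether one was used.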
Keywords: Analysis of Enriched GO Terms, Clustering in Bioinformatics, Gene Enrichment Analysis, Identify Enriched GO Terms of Genes, Microarray Gene Expression Data