An Integrated Algorithm for Dimension Reduction and Classification Applied to Microarray Data of Neuromuscular Dystrophies

Divya, Babita Pandey and Devendra K. Pandey

doi:10.17485/ijst/2016/v9i28/98378

Indian Journal of Science and Technology

Year: 2016, Volume: 9, Issue: 28, Pages: 1-6

Original Article

Divya¹, Babita Pandey²* and Devendra K. Pandey³

¹School of Computer Science and Engineering, [email protected]
²School of Computer Applications, [email protected]
³School of Biosciences, [email protected]
*Author for correspondence
Babita Pandey
School of Computer Applications,
Email: [email protected]

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Background/Objectives: Microarray technology allows neuromuscular dystrophies to be predicted from gene expression patterns. Microarray gene expression data suffer from the curse of dimensionality, i.e., tens of thousands of genes but only a few samples, so the dimension must be reduced for accurate diagnosis. Methods/Statistical Analysis: First, five-fold cross-validation is applied to randomly partition the samples. Two feature selection techniques, the t-test and entropy, are then used to rank and select genes. K-nearest neighbor and linear support vector machine classifiers are applied to the top-ranked genes to classify diseased samples. The performance of these integrated techniques is tested on microarray datasets of two neuromuscular dystrophies, Juvenile Dermatomyositis (JDM) and Facioscapulohumeral Muscular Dystrophy (FSHD). Findings: Effective disease-specific genes are selected from thousands of genes. The performance measures show that the combination of entropy with k-nearest neighbor outperformed the other combinations on both datasets, giving 89.47% accuracy on the JDM dataset and 100% accuracy on the FSHD dataset. This is the first application of these integrated methods to these two disease datasets, and the approach can be applied to other neuromuscular disorder datasets as well.
Keywords: Dimension Reduction, Entropy, K-Fold Validation, K-Nearest Neighbor, Neuromuscular Dystrophy, Support Vector Machine
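
The pipeline summarized in the abstract (rank genes, keep the top few, classify with k-nearest neighbor, evaluate by five-fold cross-validation) can be illustrated with a minimal, dependency-free Python sketch. Everything here is an assumption made for illustration and is not the authors' implementation: the toy data generator `make_toy_data`, the helper names, the choice of the Welch t-statistic for the t-test ranking, the choice of the symmetric Kullback-Leibler divergence between fitted Gaussians as the "entropy" criterion, the decision to re-rank genes inside each training fold, and the values of `top` and `k`. The linear support vector machine is omitted to keep the sketch short.

```python
import math
import random
from statistics import mean, variance


def t_score(a, b):
    """Absolute Welch t-statistic between two groups of expression values."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return abs(mean(a) - mean(b)) / se if se > 0 else 0.0


def entropy_score(a, b):
    """Symmetric KL divergence of Gaussians fitted to each group.
    Assumed stand-in for the paper's entropy criterion."""
    va, vb = variance(a), variance(b)
    if va == 0 or vb == 0:
        return 0.0
    spread = 0.5 * (va / vb + vb / va - 2)
    shift = 0.5 * (mean(a) - mean(b)) ** 2 * (1 / va + 1 / vb)
    return spread + shift


def rank_genes(X, y, score, top):
    """Indices of the `top` genes with the highest class-separation score."""
    scores = []
    for g in range(len(X[0])):
        pos = [row[g] for row, label in zip(X, y) if label == 1]
        neg = [row[g] for row, label in zip(X, y) if label == 0]
        scores.append(score(pos, neg))
    return sorted(range(len(scores)), key=lambda g: -scores[g])[:top]


def knn_predict(train_X, train_y, x, k=3):
    """Majority vote among the k nearest training samples (Euclidean)."""
    pairs = sorted(zip(train_X, train_y), key=lambda p: math.dist(p[0], x))
    votes = sum(label for _, label in pairs[:k])
    return 1 if 2 * votes > k else 0


def cross_validate(X, y, score, top=5, k=3, folds=5, seed=0):
    """Five-fold accuracy; genes are re-ranked on every training split
    so the held-out samples never influence feature selection."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for f in range(folds):
        test = idx[f::folds]
        train = [i for i in idx if i not in test]
        Xtr, ytr = [X[i] for i in train], [y[i] for i in train]
        genes = rank_genes(Xtr, ytr, score, top)
        Xtr_sel = [[row[g] for g in genes] for row in Xtr]
        for i in test:
            x = [X[i][g] for g in genes]
            correct += knn_predict(Xtr_sel, ytr, x, k) == y[i]
    return correct / len(X)


def make_toy_data(n_per_class=10, n_genes=50, n_informative=3, seed=1):
    """Hypothetical data: few samples, many genes, a few shifted by class."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0, 1) for _ in range(n_genes)]
            for g in range(n_informative):
                row[g] += 4.0 * label
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
for name, score in (("t-test", t_score), ("entropy", entropy_score)):
    print(name, "+ k-NN accuracy:", cross_validate(X, y, score))
```

Ranking genes inside each training fold, rather than once on the full dataset, is a design choice of this sketch; the abstract does not say which order the authors used. The accuracies printed for the toy data say nothing about the JDM or FSHD results reported above.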