Indian Journal of Science and Technology
Year: 2016, Volume: 9, Issue: Special Issue 1, Pages: 1-8
S. Jacophine Susmi¹*, H. Khanna Nehemiah¹ and A. Kannan²
¹ Ramanujan Computing Centre, Anna University, Chennai – 600025, Tamil Nadu, India
² Department of Information Science and Technology, Anna University, Chennai – 600025, Tamil Nadu, India
*Author for correspondence
Ramanujan Computing Centre
Background/Objectives: This paper presents a hybrid framework for classifying leukemia gene expression data. The framework consists of three subsystems: a class-based dimension reduction subsystem, a feature selection subsystem and a classification subsystem. Methods/Statistical Analysis: Class-based dimension reduction is applied to the leukemia gene expression dataset using Principal Component Analysis (PCA) and Canonical Correlation Analysis (CCA): samples of the Acute Lymphoblastic Leukemia (ALL) class are reduced with PCA and samples of the Acute Myeloid Leukemia (AML) class with CCA, yielding dimension-reduced data. The feature selection subsystem uses a Genetic Algorithm (GA) to select an optimal subset of informative genes. The classification subsystem uses these informative genes to train a Neural Network (NN) classifier. Findings: The performance of the hybrid framework, GA-PCA and CCA, is compared with that of the single dimension reduction techniques GA-PCA and GA-CCA. The experimental results show that the proposed framework achieves an accuracy of 88.23%, a sensitivity of 85% and a specificity of 92.85%. The framework also aids in identifying the informative genes relevant to leukemia gene expression data. Applications/Improvements: The classification accuracy of GA-PCA and CCA improves on that of either single dimension reduction technique. This suggests that combining more than one dimension reduction method can yield higher classification accuracy and aid in the identification of new classes.
Keywords: Cancer Classification, Canonical Correlation Analysis (CCA), Dimensionality Reduction (DR), Genetic Algorithm (GA), Neural Network (NN), Principal Component Analysis (PCA)
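The accuracy, sensitivity and specificity reported in the Findings are standard measures derived from a binary confusion matrix. The following minimal sketch shows how they are computed. The abstract does not report the underlying counts or state which class is treated as positive, so the function name classification_metrics and the counts used below are hypothetical, chosen only as one combination that is consistent with the reported percentages.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return accuracy, sensitivity and specificity from binary confusion-matrix counts.

    tp/fn: positive-class samples classified correctly/incorrectly.
    tn/fp: negative-class samples classified correctly/incorrectly.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,       # fraction of all samples classified correctly
        "sensitivity": tp / (tp + fn),       # true positive rate
        "specificity": tn / (tn + fp),       # true negative rate
    }


# HYPOTHETICAL counts, not taken from the paper: an assumed test set of
# 34 samples with 20 in the positive class and 14 in the negative class.
metrics = classification_metrics(tp=17, fn=3, tn=13, fp=1)

for name, value in metrics.items():
    print(f"{name}: {value:.4%}")
```

With these assumed counts the accuracy is 30/34, the sensitivity is 17/20 and the specificity is 13/14, which agree with the reported 88.23%, 85% and 92.85% up to rounding in the last digit.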