P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology



Year: 2015, Volume: 8, Issue: 14, Pages: 1-11

Original Article

Mining the Amino Acid Dominance in Gene Sequences


Classification techniques have recently been widely applied in bioinformatics. The proposed Amino Acid Component based Classification algorithm adopts the Iterative Dichotomiser 3 (ID3) classifier. The algorithm consists of two phases: attribute selection and component-based classification. The attribute selection phase identifies the dominant amino acids and the amino acid deficiencies that cause disease. The second phase finds the amino acid components that spread the disease in the specified sequence. Experiments were carried out on dengue virus gene sequences available in the NCBI online biological database, and the proposed algorithm achieved an accuracy of 90.744%. The proposed classification algorithm is compared with traditional benchmark algorithms: Naive Bayes, ID3, Random Forest, Multilayer Perceptron and J48. The results of this work can be used by drug designers to predict new viral diseases.

Keywords: Amino Acid Components, Classification, Entropy, Information Gain
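
The abstract names entropy and information gain, the quantities ID3 uses to rank attributes, but gives no formulas or data. The following Python sketch illustrates how such an attribute selection step could work in general. It is not the authors' implementation: the attribute names (`leucine`, `glycine`), the `high`/`low` levels, the class labels and the toy records are all hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(rows, labels, attribute):
    """Reduction in label entropy obtained by splitting on one attribute."""
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    # Weighted average entropy of the subsets produced by the split.
    remainder = sum(len(p) / len(labels) * entropy(p) for p in partitions.values())
    return entropy(labels) - remainder


# Hypothetical toy records: each row gives a discretised amino acid level
# for one sequence. These values are invented, not taken from the paper.
rows = [
    {"leucine": "high", "glycine": "high"},
    {"leucine": "high", "glycine": "low"},
    {"leucine": "low", "glycine": "high"},
    {"leucine": "low", "glycine": "low"},
]
labels = ["class_1", "class_1", "class_2", "class_2"]

# ID3 picks the attribute with the highest information gain as the next split.
gains = {name: information_gain(rows, labels, name) for name in rows[0]}
best_attribute = max(gains, key=gains.get)
print(gains, best_attribute)
```

In this toy data the hypothetical `leucine` attribute separates the two classes perfectly and `glycine` carries no information, so an ID3-style selection would split on `leucine` first.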

