P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology



Year: 2020, Volume: 13, Issue: 42, Pages: 4396-4406

Original Article

A search space enhanced modified whale optimization algorithm for feature selection in large-scale microarray datasets

Received Date: 19 September 2020, Accepted Date: 28 November 2020, Published Date: 04 December 2020


Objectives: To enhance microarray data classification accuracy, accelerate the convergence speed of the classifier, and refine the balance between local exploitation and global exploration in the Modified Whale Optimization Algorithm (MWOA), a Search space enhanced Modified Whale Optimization Algorithm (SMWOA) is proposed. Methods: SMWOA selects the optimal features based on the Levy flight method and the quadratic interpolation method. Levy flight is employed to accelerate the convergence speed of SMWOA and, by increasing population diversity, to keep the search from being trapped in local optima. Quadratic interpolation is used in the exploitation stage for deeper searching within the search area. Findings: In addition, a self-adaptive control parameter is introduced to make a clear difference to solution quality. It refines the balance between local exploitation and global exploration. After feature selection, the selected features are processed by Naïve Bayes (NB), Support Vector Machine (SVM), K-Nearest Neighbor (KNN), and Artificial Neural Network (ANN) classifiers for cancer detection. Novelty: Classification accuracy is improved by processing only the most discriminative features in the classifiers. The overall accuracy, specificity, sensitivity, F1-score and average error of SMWOA-ANN are 6.7%, 5.6%, 7.3% and 5.6% greater than MWOA-ANN respectively for cancer detection.
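The two search operators named in the Methods can be illustrated with a minimal Python sketch. The abstract does not give the update equations, so everything here is an assumption: the Levy step uses Mantegna's algorithm, the interpolation uses the standard three-point parabola formula applied per dimension, and the function names (`levy_step`, `levy_move`, `quadratic_interpolation`) and the parameters `beta`, `scale` and `eps` are hypothetical choices, not the authors' implementation.

```python
import math
import random


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step length (Mantegna's algorithm, assumed here)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero in the (very unlikely) degenerate draw
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_move(position, best, scale=0.01, beta=1.5, rng=random):
    """Perturb a candidate with Levy steps scaled by its distance to the best solution."""
    return [x + scale * levy_step(beta, rng) * (x - b)
            for x, b in zip(position, best)]


def quadratic_interpolation(a, b, c, fa, fb, fc, eps=1e-12):
    """Per-dimension vertex of the parabola through (a, fa), (b, fb), (c, fc).

    a, b, c are candidate positions (lists of floats); fa, fb, fc are their
    scalar fitness values. Falls back to b's coordinate when the three points
    do not define a parabola in that dimension.
    """
    out = []
    for ai, bi, ci in zip(a, b, c):
        num = (bi**2 - ci**2) * fa + (ci**2 - ai**2) * fb + (ai**2 - bi**2) * fc
        den = (bi - ci) * fa + (ci - ai) * fb + (ai - bi) * fc
        out.append(bi if abs(den) < eps else 0.5 * num / den)
    return out
```

For the one-dimensional fitness f(x) = (x - 2)^2, the points 0, 1 and 3 have fitness 4, 1 and 1, and `quadratic_interpolation([0.0], [1.0], [3.0], 4.0, 1.0, 1.0)` returns the vertex at x = 2. In a feature-selection setting the continuous positions would still need to be mapped to a binary feature mask, a step the abstract does not describe.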

Keywords: Gene expression data; dimensionality reduction; feature selection; modified whale optimization algorithm (MWOA); search space enhanced modified whale optimization algorithm (SMWOA)




© 2020 Sathya & Manju Priya. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published by Indian Society for Education and Environment (iSee)

