Indian Journal of Science and Technology
DOI: 10.17485/ijst/2014/v7i12.11
Year: 2014, Volume: 7, Issue: 12, Pages: 2007–2014
Original Article
Lipismita Panigrahi1, Kaberi Das2 and Debahuti Mishra3*
1 Computer Science & Information Technology, Balasore College of Engineering and Technology, Balasore, Odisha, India
2 Department of Computer Applications, Institute of Technical Education and Research, Siksha O Anusandhan Deemed to be University, Bhubaneswar, Odisha, India
3 Computer Science & Engineering, Institute of Technical Education and Research, Siksha O Anusandhan Deemed to be University, Bhubaneswar, Odisha, India; debahutimishra@soauniversity.ac.in
Missing values can cause serious problems when mining data sets: i) loss of information and efficiency; ii) difficulties in data handling, computation and analysis, owing to irregularities in the data patterns and the inapplicability of standard software; and iii) serious bias when there are systematic differences between the observed and the unobserved data, which can lead to misleading results. This paper presents a methodological framework for developing an automated data imputation model based on a Hybrid Higher Order Neural Network Classifier (HHONC). Four data sets, comprising real, integer and simulated data, are subjected to a perturbation experiment in which missing values are generated at random. Imputation methods are applied to the glass identification, wine recognition, heart disease and lung cancer data sets to estimate the missing values, and the results are compared with those of classic imputation procedures. The experiments indicate that imputation improves the quality of a database with missing values, and that the best results are obtained with different variables.
Keywords: Hybrid Higher Order Neural Classifier (HHONC), Imputation Method, Missing Value, Neural Network (NN)
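The perturbation experiment summarised in the abstract can be illustrated with a minimal sketch. It assumes values are deleted completely at random and uses column-mean imputation as a stand-in for a classic procedure; it does not implement HHONC. The function names (`perturb`, `mean_impute`, `imputation_rmse`), the 10% missing rate, the RMSE score and the synthetic data are all hypothetical choices made for illustration and do not come from the paper.

```python
import math
import random


def perturb(rows, missing_rate, rng):
    """Return a copy of rows with entries replaced by None at random."""
    return [
        [None if rng.random() < missing_rate else value for value in row]
        for row in rows
    ]


def mean_impute(rows):
    """Fill each None with the mean of the observed values in its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        # Fall back to 0.0 if a column lost every value (an arbitrary choice).
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return [
        [means[j] if row[j] is None else row[j] for j in range(n_cols)]
        for row in rows
    ]


def imputation_rmse(original, perturbed, imputed):
    """RMSE computed only over the cells deleted by the perturbation."""
    errors = [
        (imputed[i][j] - original[i][j]) ** 2
        for i, row in enumerate(perturbed)
        for j, value in enumerate(row)
        if value is None
    ]
    return math.sqrt(sum(errors) / len(errors)) if errors else 0.0


rng = random.Random(0)
# Synthetic stand-in data, not one of the paper's data sets.
data = [[rng.gauss(10.0, 2.0), rng.gauss(50.0, 5.0)] for _ in range(200)]
masked = perturb(data, missing_rate=0.1, rng=rng)
filled = mean_impute(masked)
print(round(imputation_rmse(data, masked, filled), 3))
```

Because the original values are known before deletion, any imputation method can be scored the same way by swapping it in for `mean_impute`, which is what makes a perturbation design suitable for comparing methods.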