Indian Journal of Science and Technology
DOI: 10.17485/IJST/v16i45.1781
Year: 2023, Volume: 16, Issue: 45, Pages: 4156-4163
Original Article
M Kavitha1*, M Kasthuri2
1Research Scholar/Assistant Professor, Department of Computer Applications, Bishop Heber College, Affiliated to Bharathidasan University, Trichirappalli, 620024, Tamil Nadu, India
2Associate Professor, Department of Computer Applications, Bishop Heber College, Affiliated to Bharathidasan University, Trichirappalli, 620024, Tamil Nadu, India
*Corresponding Author
Email: [email protected]
Received Date: 15 July 2023, Accepted Date: 08 October 2023, Published Date: 05 December 2023
Background/Objectives: The goal of this study was to create an Enco-Standardization technique that produces accurate data and improves the diagnosis of Autism Spectrum Disorder (ASD). The method replaces missing values in a dataset with mean values and then refines the data by combining label encoding with conventional scaling techniques.
Methods: The study uses the ASD dataset, which has 704 instances and 21 attributes. The dataset is split into training and testing sets (80%-20%). As the imputation strategy, missing values are located and replaced with the mean value. Under the Enco-Standardization methodology, attributes are encoded with a label encoding technique that converts non-numeric variables into numeric ones. The data are then scaled to standardise them into a machine-readable format. Different machine learning classifier models are compared using this hybrid strategy of encoding and scaling. The dataset produced by the Enco-Standardization technique is assessed on the accuracy obtained with these classifier models.
Findings: The dataset needs to be accurate and relevant in order to increase accuracy and decrease computing time. The Enco-Standardization methodology proved to be a good pre-processing method, with accuracy values of 98% for Naive Bayes (NB), 71% for K Nearest Neighbour (KNN), 74% for Support Vector Machine (SVM), 97% for Linear Regression (LR), 100% for Decision Tree (DT), and 100% for Random Forest (RF). Deleting missing values instead improves performance for KNN (94%), SVM (95.9%), and LR, DT, and RF (100%), but it decreases the number of instances in the dataset, rendering the model ineffective.
Novelty: The proposed Enco-Standardization pre-processing technique transforms and encodes the data in a dataset, which increases the precision of the data analysis process in ASD prediction. Using the Enco-Standardization technique also avoids data discrepancies.
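The pre-processing steps described in the Methods (mean imputation, label encoding, then standard scaling) can be illustrated with a minimal sketch. This is not the authors' implementation: the helper names (`impute_mean`, `label_encode`, `standardize`) and the toy columns are hypothetical, missing values are assumed to be represented as `None`, and statistics are computed over the whole column for brevity rather than on the training split only.

```python
from statistics import fmean, pstdev


def impute_mean(column):
    """Replace missing entries (None) with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = fmean(observed)
    return [mean if v is None else v for v in column]


def label_encode(column):
    """Map each distinct category to an integer code, assigned in sorted order."""
    codes = {category: i for i, category in enumerate(sorted(set(column)))}
    return [codes[v] for v in column]


def standardize(column):
    """Rescale a numeric column to zero mean and unit (population) standard deviation."""
    mean = fmean(column)
    std = pstdev(column)
    if std == 0:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in column]
    return [(v - mean) / std for v in column]


# Hypothetical toy columns standing in for two dataset attributes.
ages = [4, None, 6, 10]          # numeric attribute with one missing value
genders = ["m", "f", "f", "m"]   # non-numeric attribute

ages_imputed = impute_mean(ages)          # step 1: mean imputation
genders_encoded = label_encode(genders)   # step 2: label encoding
features = [                              # step 3: standard scaling
    standardize(ages_imputed),
    standardize([float(c) for c in genders_encoded]),
]
```

In a full experiment the scaled columns would then be passed to the classifiers being compared. Fitting the imputation mean and scaling statistics on the training portion alone is a common precaution against leakage, though the abstract does not state which approach was used.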
Keywords: Autism Spectrum Disorder, Preprocessing, Scaling, EncoStandardization, Machine Learning
© 2023 Kavitha & Kasthuri. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)