P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2016, Volume: 9, Issue: 4, Pages: 1-8

Original Article

Developing a Modified Logistic Regression Model for Diabetes Mellitus and Identifying the Important Factors of Type II DM

Abstract

Background/Objectives: Several methods can be used to build predictive models for clinical data with a binary outcome variable. This research explores the construction of a modified Logistic Regression (LR) predictive model.

Method/Statistical Analysis: To improve prediction accuracy, Distance-based Outlier Detection (DBOD) is used for pre-processing, and a Bipolar Sigmoid Function calculated using a Neuro-based Weight Activation Function is used in Logistic Regression in place of the Sigmoid Function. The datasets, collected from the clinical laboratory of AR Hospital in Madurai for the three years 2012, 2013 and 2014, are used for analysis. Data pre-processing is done to remove insignificant data from the dataset. Outliers detected by the DBOD method are treated using a method closest to the normal range. A comparative study of different distance measures, such as Euclidean and Manhattan, is carried out for the DBOD method. The pre-processed data are then fed as input to the Logistic Regression model, which is fitted by maximum likelihood estimation. The Logistic Model is built from the Sigmoid Function using the regression coefficients, and its accuracy is evaluated by 10-fold cross-validation.

Findings: The Logistic Model built from the Sigmoid Function using the regression coefficients produces an accuracy of 79%. The Sigmoid Function calculated using the Random Weight Function provides a prediction accuracy of 84.2%, and the Bipolar Sigmoid Function calculated using the Neuro-based Weight Activation Function provides a prediction accuracy of 90.4%. On comparison, the Bipolar Sigmoid Function calculated using the Neuro-based Weight Activation Function outperforms the Sigmoid Function calculated using regression coefficients.

Improvements/Applications: The accuracy of Logistic Regression is improved from 79% to 90.4%. The most important factors, Erythrocyte Sedimentation Rate (ESR) and Estimation of Mean Blood Glucose, are identified from the positive subjects of Diabetes Mellitus. The analysis covers 31 Diabetes Disease attributes of the three-year dataset.
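
As an illustration of two building blocks named in the abstract, the following Python sketch shows the standard sigmoid, the bipolar sigmoid, and a simple distance-based outlier check with interchangeable Euclidean and Manhattan distances. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the DBOD rule, so a common neighbour-count formulation is assumed here, and the function names, the `radius` and `min_fraction` parameters, and the toy points are all hypothetical. The Neuro-based Weight Activation Function is not specified in the abstract and is not modelled.

```python
import math


def sigmoid(z):
    """Standard logistic sigmoid; output lies in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def bipolar_sigmoid(z):
    """Bipolar sigmoid (1 - e^-z) / (1 + e^-z); output lies in (-1, 1)."""
    return (1.0 - math.exp(-z)) / (1.0 + math.exp(-z))


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length points."""
    return sum(abs(x - y) for x, y in zip(a, b))


def distance_based_outliers(points, radius, min_fraction, distance=euclidean):
    """Return indices of points that have too few neighbours.

    Assumed rule (not taken from the paper): a point is an outlier when
    fewer than `min_fraction` of the other points lie within `radius`.
    """
    n = len(points)
    outliers = []
    for i, p in enumerate(points):
        neighbours = sum(
            1
            for j, q in enumerate(points)
            if j != i and distance(p, q) <= radius
        )
        if neighbours < min_fraction * (n - 1):
            outliers.append(i)
    return outliers


# Hypothetical toy data (NOT from the hospital dataset): four clustered
# points and one isolated point.
toy_points = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.2), (8.0, 8.0)]

flagged_euclidean = distance_based_outliers(toy_points, 1.0, 0.5, euclidean)
flagged_manhattan = distance_based_outliers(toy_points, 1.0, 0.5, manhattan)
```

The bipolar sigmoid is a rescaling of the sigmoid, `2 * sigmoid(z) - 1`, so its output lies in (-1, 1) rather than (0, 1). Passing `euclidean` or `manhattan` as the `distance` argument mirrors the kind of distance-measure comparison the abstract describes for DBOD.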

Keywords: Bipolar Sigmoid Neuro-Weight Activation Function, Distance based Outlier Detection Method, Logistic Regression, Random Weight Function, Sigmoid Activation Function, Type 2 Diabetes Risk Factors
