P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology


Year: 2018, Volume: 11, Issue: 3, Pages: 1-6

Original Article

Performance Analysis of Regression and Classification Models in the Prediction of Breast Cancer


Objective: To propose an automated diagnostic system for the early detection of breast cancer. Methods: The problem is addressed with machine learning algorithms that can accurately classify a tumor as either malignant or benign using the minimum number of image features. A comparative study of several classification approaches, namely Decision Tree, Support Vector Machine, K-Nearest Neighbor and Random Forest, has also been conducted, with a focus on cross validation to identify the best-performing model. Findings: The study shows that the Random Forest classifier gives the highest accuracy. It also highlights that cross validation and fine tuning are necessary to prevent overfitting. Improvements: The selection of parameters plays a very important role in correct classification, as multicollinearity among attributes can render classifier models ineffective.

Keywords: Breast Cancer, Classification, Cross Validation, Decision Tree, K-Nearest Neighbor, Logistic Regression, Random Forest, Support Vector Machine
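
The role of cross validation in model selection, as described in the abstract, can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic two-feature points standing in for benign and malignant samples, and the helper names (`make_synthetic`, `knn_predict`, `cross_val_accuracy`) are hypothetical. It uses only the Python standard library and tunes a single hyperparameter (the number of neighbors in K-Nearest Neighbor) by 5-fold cross validation.

```python
import random
from collections import Counter
from math import dist


def make_synthetic(n_per_class=60, seed=0):
    """Assumption: two Gaussian blobs stand in for benign (0) and malignant (1) samples."""
    rng = random.Random(seed)
    data = []
    for label, center in ((0, (0.0, 0.0)), (1, (3.0, 3.0))):
        for _ in range(n_per_class):
            point = tuple(rng.gauss(c, 1.0) for c in center)
            data.append((point, label))
    rng.shuffle(data)
    return data


def knn_predict(train, point, k):
    """Majority vote among the k training points closest to `point`."""
    neighbors = sorted(train, key=lambda item: dist(item[0], point))[:k]
    votes = Counter(label for _, label in neighbors)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data, k_neighbors, n_folds=5):
    """Mean accuracy over n_folds held-out folds (every n_folds-th sample per fold)."""
    fold_scores = []
    for fold in range(n_folds):
        test = data[fold::n_folds]
        train = [item for i, item in enumerate(data) if i % n_folds != fold]
        correct = sum(knn_predict(train, p, k_neighbors) == y for p, y in test)
        fold_scores.append(correct / len(test))
    return sum(fold_scores) / n_folds


# Compare candidate hyperparameter values on held-out folds only.
data = make_synthetic()
scores = {k: cross_val_accuracy(data, k) for k in (1, 3, 5, 7)}
best_k = max(scores, key=scores.get)
print(f"cross-validated accuracy per k: {scores}, best k: {best_k}")
```

Because every score is computed on samples the model did not see during fitting, a setting that merely memorizes the training data gains no advantage. The same comparison extends to the other classifiers named in the abstract by swapping the prediction function.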

