Indian Journal of Science and Technology
Year: 2015, Volume: 8, Issue: 35, Pages: 1-13
B. V. Sumana1* and T. Santhanam2
1 Department of Computer Science, Vijaya College, Jayanagar, Bangalore - 560011, Karnataka, India; [email protected]
2 Department of Computer Applications, DG Vaishnav College, Chennai - 600106, Tamil Nadu, India; [email protected]
Background/Objectives: For more than a decade, ensemble methods such as Bagging and Boosting have drawn great attention from researchers aiming to improve prediction accuracy over single classifiers. However, some recent studies have observed that Bagging and Boosting do not always improve accuracy; they enhance it only when the base classifier is unstable. To overcome this problem, a Hybrid Ensemble Model with two phases of preprocessing is proposed in this paper and evaluated using 9 classifiers on 3 benchmark data sets from the UCI Repository. Methods: In the first phase of preprocessing, feature selection is performed using CFS to select the attributes most highly correlated with the class. In the second phase, the K-means clustering algorithm is applied to remove the incorrectly classified instances. Finally, the instances remaining after these two phases are used to train Bagging and Boosting ensembles, building the final Hybrid Ensemble Classifier Model (HECM) with 10-fold cross-validation. The results were evaluated using the confusion matrix and performance measures such as accuracy, kappa, mean absolute error and time taken to build the model. Findings: The results show that the proposed model is more efficient than the existing models, with accuracy improvements for both stable and unstable classifiers ranging from 2% to 30.14% over the traditional ensemble model, depending on the complexity of the algorithm.
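The second preprocessing phase can be illustrated with a minimal sketch. The abstract does not specify how K-means identifies incorrectly classified instances, so the code below makes several assumptions: the number of clusters equals the number of classes, each cluster is mapped to the majority class label of its members, and an instance is removed when its label disagrees with that of its cluster. The function names and the deterministic farthest-first initialisation are hypothetical choices made for illustration, not taken from the paper.

```python
from collections import Counter


def sq_dist(p, q):
    """Squared Euclidean distance between two numeric vectors."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def init_centroids(points, k):
    """Farthest-first initialisation (an assumption, chosen for determinism)."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        # Pick the point whose nearest existing centroid is farthest away.
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    return centroids


def kmeans(points, k, max_iter=100):
    """Plain Lloyd's K-means; returns one cluster index per point."""
    centroids = init_centroids(points, k)
    assign = None
    for _ in range(max_iter):
        # Assignment step: nearest centroid for every point.
        new_assign = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points
        ]
        if new_assign == assign:
            break  # converged: no point changed cluster
        assign = new_assign
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def filter_by_clusters(points, labels):
    """Return indices of instances whose label matches their cluster's majority label.

    Assumption: k is set to the number of distinct class labels.
    """
    k = len(set(labels))
    assign = kmeans(points, k)
    majority = {}
    for c in set(assign):
        votes = Counter(l for l, a in zip(labels, assign) if a == c)
        majority[c] = votes.most_common(1)[0][0]
    return [
        i for i, (l, a) in enumerate(zip(labels, assign)) if l == majority[a]
    ]
```

Under these assumptions, the indices returned by `filter_by_clusters` would select the instances passed on to the Bagging and Boosting stage; the CFS phase would run beforehand to reduce each instance to its selected attributes.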
Keywords: Bagging, Boosting, Classification, Correlation Based Feature Selection (CFS), Hybrid, K-Means