Indian Journal of Science and Technology
Year: 2017, Volume: 10, Issue: 30, Pages: 1-6
Parneeta Dhaliwal1* and M. P. S. Bhatia2
1Department of CSE, School of Engineering and Technology, K. R. Mangalam University, Gurgaon – 122103, Haryana, India; [email protected]
2Division of CoE, Netaji Subhas Institute of Technology, Dwarka, New Delhi – 110078, India; [email protected]
*Author for correspondence:
Department of CSE, School of Engineering and Technology, K. R. Mangalam University, Gurgaon – 122103, Haryana, India; [email protected]
Background: Many applications now involve large volumes of data whose underlying concept changes over time. Such data must be handled with high accuracy, even in a resource-constrained environment. Objectives: To achieve better generalization accuracy on data with drifting concepts, mainly recurrent drifts, we propose an ensemble system called Recurring Dynamic Weighted Majority (RDWM). Methods: Our system maintains a primary online ensemble of experts that represent the present concepts and a secondary ensemble of experts that represent the old concepts encountered since the beginning of learning. An effective pruning methodology removes redundant and old classifiers from the system. Findings: Experimental analysis on the Stagger dataset shows that our system performs best on data containing abrupt as well as recurrent drifts, achieving the best prequential accuracy when an optimal window size is used. RDWM is also highly resource-efficient compared to the EDDM approach. Experimental evaluation on a real-world electricity pricing dataset likewise shows RDWM to be the best system, performing very accurately even in a resource-constrained environment. Improvements: The system can be further enhanced to handle novelty detection in data streams.
Keywords: Concept Drift, Data Streams, Recurring, Recurring Concept
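The abstract reports prequential accuracy, which is normally computed in a test-then-train fashion: each example is first used to test the model and only then to train it. The following is a minimal sketch of that evaluation loop, not the authors' RDWM implementation. The names `MajorityClassLearner` and `prequential_accuracy` are hypothetical, the sliding window over recent outcomes is an assumption (the window size mentioned in the abstract may play a different role), and the stream is synthetic toy data with a single abrupt drift.

```python
from collections import Counter, deque


class MajorityClassLearner:
    """Hypothetical stand-in learner: predicts the most frequent label seen so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        # No prediction is possible before the first training example.
        if not self.counts:
            return None
        return self.counts.most_common(1)[0][0]

    def learn(self, x, y):
        self.counts[y] += 1


def prequential_accuracy(model, stream, window_size):
    """Test-then-train: return the sliding-window accuracy after each example."""
    recent = deque(maxlen=window_size)  # 1 = correct, 0 = wrong (assumed windowing)
    curve = []
    for x, y in stream:
        recent.append(1 if model.predict(x) == y else 0)  # test first
        model.learn(x, y)                                  # then train
        curve.append(sum(recent) / len(recent))
    return curve


# Synthetic stream with one abrupt drift: label "a" for 20 examples, then "b".
stream = [((i,), "a") for i in range(20)] + [((i,), "b") for i in range(20)]
curve = prequential_accuracy(MajorityClassLearner(), stream, window_size=5)
print(curve[19], curve[24])
```

In this toy run the non-adaptive learner keeps predicting the old label after the drift, so the windowed accuracy collapses. That is the failure mode drift-aware ensembles are meant to avoid, and a learner that retains experts for old concepts would be expected to recover faster when a concept recurs.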