Indian Journal of Science and Technology
DOI: 10.17485/ijst/2016/v9i44/95055
Year: 2016, Volume: 9, Issue: 44, Pages: 1-9
Original Article
D. Kavitha1*, V. Kamakshi Prasad2 and J. V. R. Murthy3
1PVPSIT, Vijayawada - 520007, Andhra Pradesh, India; [email protected]
2Department of Computer Science and Engineering, JNTUH College of Engineering, JNTU Hyderabad - 500085, Telangana, India; [email protected]
3Department of Computer Science and Engineering, JNTU Kakinada, Kakinada - 533003, Andhra Pradesh, India; [email protected]
*Author for correspondence
D. Kavitha
PVPSIT, Vijayawada - 520007, Andhra Pradesh, India; [email protected]
Objectives: To mine significant subgraphs from a set of graphs, using user-specified objective functions, in a scalable way; such subgraphs are useful for understanding the intrinsic characteristics of the data. Methods/Statistical Analysis: The large number of candidate subgraphs generated during the mining process causes both computational and statistical problems. This paper proposes Significant SubGraph Mining (SSGM), an algorithm that overcomes these problems by finding significant subgraphs from a small set of representative patterns, the coreset. Furthermore, each graph is represented in an edge graph notation, which allows patterns to be mined directly without a separate mining algorithm. Findings: The number of possible candidates is generally exponential in the search space, and the techniques employed are mostly focussed on the monotonic property. The proposed algorithm offers simple yet efficient optimizations that significantly improve performance by pruning the search space and exploring representative graphs. It avoids enumerating all frequent subgraphs, which causes redundancy and excessive mining time. The identified coreset elements are extended to provide optimal solution patterns, and the adopted edge graph notation allows subgraphs to be mined directly. Application/Improvements: Experimental results show that the proposed algorithm is effective and efficient for mining significant subgraphs, in terms of computational cost, scalability and time, compared with existing methods. The algorithm can be applied to find different types of significant patterns in a scalable manner by using any objective function suited to the problem domain, including support, correlation measure and feature set.
Keywords: Frequent Graphs, Objective Function, Representative Set, Statistical Significance, Subgraph Mining
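The general idea described in the abstract (score patterns with a user-specified objective function, keep a small representative set, and extend only those representatives while pruning infrequent candidates) can be illustrated with a minimal sketch. This is not the SSGM algorithm itself, whose details are not given here. It assumes a strong simplification: every graph is a set of edges over shared vertex labels, so subgraph containment reduces to a subset test and no isomorphism check is needed. All names (`support`, `extensions`, `mine`, `area`) and the greedy extension strategy are hypothetical choices made for illustration.

```python
from typing import Callable, FrozenSet, Iterator, List, Tuple

Edge = Tuple[str, str]        # an edge named by its two endpoint labels
Graph = FrozenSet[Edge]       # simplification: a graph is just a set of edges
Objective = Callable[[Graph, List[Graph]], float]


def support(pattern: Graph, graphs: List[Graph]) -> int:
    """Number of graphs that contain every edge of the pattern."""
    return sum(1 for g in graphs if pattern <= g)


def extensions(pattern: Graph, graphs: List[Graph]) -> Iterator[Graph]:
    """Yield patterns one edge larger that stay connected and occur in some graph."""
    vertices = {v for edge in pattern for v in edge}
    seen = set()
    for g in graphs:
        if not pattern <= g:
            continue
        for edge in g - pattern:
            if edge[0] in vertices or edge[1] in vertices:
                candidate = pattern | {edge}
                if candidate not in seen:
                    seen.add(candidate)
                    yield candidate


def mine(graphs: List[Graph], objective: Objective,
         min_support: int, coreset_size: int) -> List[Tuple[Graph, float]]:
    """Greedy sketch: pick top-scoring frequent edges as seeds, then grow each seed."""
    def rank(p: Graph):
        # Score first; sorted edge list as a deterministic tie-breaker.
        return (objective(p, graphs), sorted(p))

    singles = {frozenset([e]) for g in graphs for e in g}
    frequent = [p for p in singles if support(p, graphs) >= min_support]
    # Hypothetical stand-in for a coreset: the k best single-edge patterns.
    coreset = sorted(frequent, key=rank, reverse=True)[:coreset_size]

    found = {}
    for seed in coreset:
        best = seed
        while True:
            # Support is anti-monotone, so infrequent candidates are dropped.
            grown = [c for c in extensions(best, graphs)
                     if support(c, graphs) >= min_support]
            top = max(grown, key=rank, default=None)
            if top is None or objective(top, graphs) <= objective(best, graphs):
                break
            best = top
        found[best] = objective(best, graphs)
    return sorted(found.items(), key=lambda kv: kv[1], reverse=True)


def area(pattern: Graph, graphs: List[Graph]) -> float:
    """Example objective (an assumption): pattern size times support."""
    return len(pattern) * support(pattern, graphs)
```

Any callable with the same signature as `area` can be passed as the objective, for example a correlation measure against class labels. Because the extension step is greedy, this sketch can stop at a local optimum; it only illustrates how a small seed set avoids enumerating every frequent subgraph.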