Indian Journal of Science and Technology
Year: 2016, Volume: 9, Issue: 10, Pages: 1-6
S. Kalpana¹* and S. Vigneshwari²
¹Department of Computer Science and Engineering, Sathyabama University, Chennai - 600119, Tamil Nadu, India; [email protected]
²Faculty of Computing, Sathyabama University, Chennai - 600119, Tamil Nadu, India; [email protected]
*Author for Correspondence
S. Kalpana Department of Computer Science and Engineering, Sathyabama University, Chennai - 600119, Tamil Nadu, India; [email protected]
Objective: To implement a multi-viewpoint similarity measure for comparing documents through clustering.
Methods/Analysis: Clustering, a core task of data mining, groups objects that are similar to one another; applied to documents, it divides a collection into meaningful clusters for analysis. Clustering methods include hierarchical and partitional clustering, and grouping may be based on measures such as Euclidean distance or viewpoint-based similarity. The existing system uses single-viewpoint similarity, whose main disadvantage is that it does not use the full set of document data, so detailed comparison measures cannot be revealed. The proposed system uses multi-viewpoint similarity to overcome this disadvantage.
Findings: The multi-viewpoint similarity method compares multiple documents in detail: documents are compared line by line and their similarity is shown. We enhanced the existing algorithm and named the enhanced version ECSMTP (Enhanced Concept Based Similarity Measure for Text Processing). The algorithm categorizes data from the selected documents along with the document weightage, forms clusters on that basis, and calculates the similarity measure. Unlike the existing system, the proposed system can compare different kinds of documents, such as plain text, Word, and PDF files. The user selects the kind of document, comparisons are made on the selected documents, and the resulting clusters are formed and compared.
Keywords: Clustering, ECSMTP, Multiviewpoint, Pattern Recognition, Singleview Point
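The contrast the abstract draws between single-viewpoint and multi-viewpoint similarity can be illustrated with a minimal sketch. The abstract gives no formula, so the code below assumes a commonly used formulation: single-viewpoint similarity is the cosine measured from the origin, while multi-viewpoint similarity averages the dot product of the two documents' difference vectors taken from every other document used as a viewpoint. The function names and the toy term-frequency vectors are hypothetical, and this is not the ECSMTP algorithm itself.

```python
from math import sqrt


def normalize(v):
    """Scale a term vector to unit length (a zero vector is returned unchanged)."""
    norm = sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm else list(v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def single_viewpoint_similarity(di, dj):
    """Cosine similarity: both documents are seen from one viewpoint, the origin."""
    return dot(normalize(di), normalize(dj))


def multi_viewpoint_similarity(di, dj, viewpoints):
    """Assumed formulation: average of (di - dh) . (dj - dh) over viewpoints dh."""
    if not viewpoints:
        raise ValueError("at least one viewpoint document is required")
    di, dj = normalize(di), normalize(dj)
    total = 0.0
    for dh in viewpoints:
        dh = normalize(dh)
        total += dot([a - h for a, h in zip(di, dh)],
                     [b - h for b, h in zip(dj, dh)])
    return total / len(viewpoints)


if __name__ == "__main__":
    # Hypothetical term-frequency vectors over a three-word vocabulary.
    doc_a, doc_b = [2, 1, 0], [1, 2, 0]   # the two documents being compared
    others = [[0, 0, 3], [0, 1, 3]]       # documents used as viewpoints
    print(single_viewpoint_similarity(doc_a, doc_b))
    print(multi_viewpoint_similarity(doc_a, doc_b, others))
```

With the origin as the only viewpoint, the second function reduces to the first, which is one way to see why the single-viewpoint measure draws on less of the document collection.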