P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2016, Volume: 9, Issue: 10, Pages: 1-6

Original Article

Selecting Multiview Point Similarity from Different Methods of Similarity Measure to Perform Document Comparison

Abstract

Objective: To implement multi-viewpoint similarity for document comparison based on clustering. Methods/Analysis: Clustering, a main task of data mining, groups objects that are similar to one another. Data mining divides the whole document into meaningful clusters and analyses the data. There are many clustering methods, such as hierarchical and partitional clustering, and the grouping may be based on distance (for example, Euclidean distance) or on viewpoints. The existing system uses single-viewpoint similarity, whose main disadvantage is that it does not use the full set of document data, so detailed comparison measures cannot be revealed. The proposed system uses multi-viewpoint similarity to overcome this disadvantage. Findings: The multi-viewpoint similarity method compares multiple documents in detail: the documents are compared line by line and their similarity is shown. We have also enhanced the existing algorithm; the enhanced version is named ECSMTP (Enhanced Concept Based Similarity Measure for Text Processing). This algorithm categorizes data from the selected documents along with the weightage of each document, forms clusters on that basis, and calculates the similarity measure. Unlike the existing system, the proposed system can compare different kinds of documents, such as plain text, Word, and PDF documents. The user selects the kind of document, and comparisons are made on the selected documents. Clusters are formed and then compared.

Keywords: Clustering, ECSMTP, Multiviewpoint, Pattern Recognition, Singleview Point
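
The contrast the abstract draws between single-viewpoint and multi-viewpoint similarity can be illustrated with a minimal Python sketch. The abstract gives no formula, so this assumes one common formulation: single-viewpoint similarity is cosine similarity measured from the origin, and multi-viewpoint similarity averages, over every other document d_h in the collection, the dot product of (d_i - d_h) and (d_j - d_h). It is not the ECSMTP algorithm, and all function names and sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text, vocab):
    """Unit-length term-frequency vector of `text` over the word list `vocab`."""
    counts = Counter(text.lower().split())
    vec = [float(counts[word]) for word in vocab]
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def single_viewpoint_similarity(di, dj):
    """Cosine similarity of two unit vectors: the origin is the only viewpoint."""
    return dot(di, dj)


def multi_viewpoint_similarity(i, j, vectors):
    """Average of (d_i - d_h) . (d_j - d_h) over all other documents d_h.

    Assumed formulation; every document except d_i and d_j acts as a viewpoint.
    """
    viewpoints = [h for h in range(len(vectors)) if h not in (i, j)]
    if not viewpoints:
        raise ValueError("need at least one document besides d_i and d_j")
    total = 0.0
    for h in viewpoints:
        diff_i = [x - y for x, y in zip(vectors[i], vectors[h])]
        diff_j = [x - y for x, y in zip(vectors[j], vectors[h])]
        total += dot(diff_i, diff_j)
    return total / len(viewpoints)


# Hypothetical sample collection, for illustration only.
docs = [
    "data mining clustering",
    "clustering data mining",
    "football match score",
    "football score result",
]
vocab = sorted({word for doc in docs for word in doc.lower().split()})
vectors = [tf_vector(doc, vocab) for doc in docs]
```

Calling `multi_viewpoint_similarity(0, 1, vectors)` scores the two clustering documents as seen from the two football documents, whereas `single_viewpoint_similarity(vectors[0], vectors[1])` uses only the two documents themselves. This is the sense in which the multi-viewpoint measure draws on the full document set.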
