• P-ISSN 0974-6846 E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2016, Volume: 9, Issue: 12, Pages: 1-11

Original Article

Maulik: A Plagiarism Detection Tool for Hindi Documents

Abstract

Objective: This paper presents Maulik, an automated plagiarism detection tool. Many plagiarism detection tools are available for English text; Maulik detects plagiarism in Hindi documents. Method: Maulik divides the text into n-grams and matches them against the text in its repository as well as against documents available online. Preprocessing techniques such as stop word removal and stemming are used. The best value of n for measuring the similarity of two Hindi documents is also determined. Cosine similarity is used to compute the similarity score. Findings: Maulik achieves a similarity score of 96.3, which is higher than that of existing Hindi plagiarism detection tools such as Plagiarism checker, Plagiarism finder, Plagiarisma, Dupli checker, and Quetext. These tools compare only exact matches and ignore language-specific constraints, whereas Maulik can detect plagiarism when the root of a word is used or a word is replaced by a synonym. Application: Maulik is a software tool that discourages plagiarism and encourages people to develop their own writing skills.

Keywords: Cosine Similarity, Plagiarism, Stemming, Stop Word, Synonyms
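
The pipeline described in the Method (stop word removal, stemming, n-gram extraction, cosine similarity) can be illustrated with a minimal Python sketch. This is not Maulik's implementation: the stop-word list, the suffix-stripping stemmer, the whitespace tokenizer, and the default n = 2 are all assumptions made for illustration, and synonym handling and online lookup are omitted.

```python
import math
from collections import Counter

# Hypothetical, very small stop-word list (illustration only).
STOP_WORDS = {"है", "का", "की", "के", "में", "और", "से", "को"}

# Hypothetical suffix list for a naive stemmer, longest suffix first.
SUFFIXES = ("ियों", "ों", "ें", "ी", "े", "ा")


def stem(word):
    """Strip the first matching suffix, keeping at least two characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize on whitespace, drop stop words, and stem what remains."""
    tokens = text.replace("।", " ").split()
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def ngrams(tokens, n):
    """Count the word n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def cosine_similarity(a, b):
    """Cosine of the angle between two n-gram count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity(doc1, doc2, n=2):
    """Similarity of two documents; n=2 is an arbitrary default here."""
    return cosine_similarity(ngrams(preprocess(doc1), n),
                             ngrams(preprocess(doc2), n))
```

Calling `similarity(text_a, text_b)` returns a value between 0 and 1. Because stemming runs before the n-grams are built, two sentences that differ only in an inflected form of the same word can still share all of their n-grams, which is the kind of match that exact-string comparison would miss.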
