An Efficient Text Pattern Matching Algorithm for Retrieving Information from Desktop

R. Janani and S. Vijayarani

doi:10.17485/ijst/2016/v9i43/95454

Indian Journal of Science and Technology

Year: 2016, Volume: 9, Issue: 43, Pages: 1-11

Original Article

R. Janani* and S. Vijayarani

Department of Computer Science, School of Computer Science and Engineering, Bharathiar University, Coimbatore - 641046, Tamil Nadu, India; [email protected], [email protected]

*Author for correspondence
R. Janani
Department of Computer Science, School of Computer Science and Engineering, Bharathiar University, Coimbatore - 641046, Tamil Nadu, India; [email protected]

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Objectives: To retrieve information from documents stored on a desktop by analyzing their contents with string matching algorithms. Methods/Statistical Analysis: To analyze document content, pattern matching algorithms are used to find all occurrences of a limited set of patterns within an input text or document. This work uses four existing string matching algorithms: the Brute Force, Knuth-Morris-Pratt (KMP), Boyer Moore and Rabin Karp algorithms. It also proposes three new string matching algorithms: Enhanced Boyer Moore, Enhanced Rabin Karp and Enhanced Knuth-Morris-Pratt. Findings: The experiments use two document types, .txt and .docx. The performance measures are search time, number of iterations and accuracy. The experimental results show that the Enhanced KMP algorithm gives better accuracy than the other string matching algorithms. Application/Improvements: These algorithms are commonly used in text mining, document classification, content analysis and plagiarism detection. In future work, the algorithms are to be enhanced further to improve their performance, and additional document types will be used for experimentation.

Keywords: Brute Force, Boyer Moore, Information Retrieval, Knuth-Morris-Pratt, Pattern Matching, Rabin Karp
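
To make the comparison in the abstract concrete, the following is a minimal Python sketch of two of the existing algorithms it names, Brute Force and Knuth-Morris-Pratt, each reporting all match positions together with a count of character comparisons. These are the standard textbook versions, not the enhanced algorithms proposed in the paper, whose details the abstract does not give. The function names are hypothetical, and counting character comparisons as a stand-in for the "number of iterations" measure is an assumption.

```python
def brute_force_search(text, pattern):
    """Return (match positions, character comparisons) for the naive scan."""
    n, m = len(text), len(pattern)
    positions, comparisons = [], 0
    if m == 0:
        return positions, comparisons
    for start in range(n - m + 1):
        j = 0
        while j < m:
            comparisons += 1  # assumed proxy for "number of iterations"
            if text[start + j] != pattern[j]:
                break
            j += 1
        if j == m:
            positions.append(start)
    return positions, comparisons


def build_failure_table(pattern):
    """failure[i] is the length of the longest proper prefix of
    pattern[:i + 1] that is also a suffix of it."""
    failure = [0] * len(pattern)
    k = 0
    for i in range(1, len(pattern)):
        while k > 0 and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k
    return failure


def kmp_search(text, pattern):
    """Return (match positions, character comparisons) for textbook KMP."""
    n, m = len(text), len(pattern)
    positions, comparisons = [], 0
    if m == 0:
        return positions, comparisons
    failure = build_failure_table(pattern)
    k = 0  # number of pattern characters currently matched
    for i in range(n):
        while True:
            comparisons += 1
            if text[i] == pattern[k]:
                k += 1
                break
            if k == 0:
                break
            k = failure[k - 1]  # fall back without moving i backwards
        if k == m:
            positions.append(i - m + 1)
            k = failure[k - 1]  # keep going so overlapping matches are found
    return positions, comparisons


if __name__ == "__main__":
    sample_text = "abababcabab"  # illustrative input, not from the paper
    print(brute_force_search(sample_text, "abab"))
    print(kmp_search(sample_text, "abab"))
```

Both functions return the same positions for any input. They differ only in how many comparisons they spend, which is the kind of difference the search time and iteration measures in the abstract are meant to capture.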