Indian Journal of Science and Technology
Year: 2015, Volume: 8, Issue: 27, Pages: 1-5
Rajeev Puri*, R. P. S. Bedi and Vishal Goyal
Punjab Technical University, Kapurthala Road, Jalandhar - 144601, Punjab, India; [email protected]
Stemming is used as a pre-processing phase in information retrieval tasks. The stemming process produces linguistically normalized text, which helps improve the results of those tasks. This paper presents a revised suffix removal approach, with an extended set of stripping rules, for building a Punjabi language stemming tool. The stemming algorithm uses regular expressions to find suffix matches, and the WordNet database is used to improve the stemming results.
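The approach summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three suffix rules, the tiny in-memory `LEXICON` (standing in for a WordNet lookup), and the function name `stem` are all assumptions chosen for illustration. Rules are tried longest first, and a stripped form is accepted only if it is a known word.

```python
import re

# Hypothetical stripping rules, ordered longest first.
# The paper's actual (extended) rule set is not reproduced here.
SUFFIX_RULES = ["ੀਆਂ", "ਆਂ", "ਾਂ"]
SUFFIX_PATTERNS = [re.compile(re.escape(suffix) + r"$") for suffix in SUFFIX_RULES]

# Toy stand-in for a WordNet database lookup (assumption).
LEXICON = {"ਕਿਤਾਬ", "ਕੁੜੀ", "ਘਰ"}


def stem(word, lexicon=LEXICON):
    """Strip the first matching suffix whose remainder is a known word."""
    if word in lexicon:
        # Already a root form; nothing to strip.
        return word
    for pattern in SUFFIX_PATTERNS:
        match = pattern.search(word)
        if match is None:
            continue
        candidate = word[:match.start()]
        # Accept the candidate only if the lexicon confirms it is a real word.
        if candidate and candidate in lexicon:
            return candidate
    # No rule produced a validated stem; leave the word unchanged.
    return word
```

With these assumed rules, `stem("ਕੁੜੀਆਂ")` first tries the longest suffix, rejects the resulting form because it is not in the lexicon, and then accepts the form produced by the shorter suffix. Checking candidates against a lexicon in this way is one plausible reading of how a word database can reduce over-stemming.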
Keywords: Brute Force, Rule Based Stemming, Punjabi Stemmer, Suffix Stripping, WordNet