Indian Journal of Science and Technology

Year: 2023, Volume: 16, Issue: 37, Pages: 3050-3063

Enhancement in Stemmer Design: Natural Language Semantics Perspective

Received Date: 27 March 2023, Accepted Date: 17 August 2023, Published Date: 30 September 2023


Objective: To enhance the performance and accuracy of the stemming process. Method: The Porter stemmer is conventionally used to remove common morphological and inflectional endings (suffixes) from English words. It relies on a set of pre-defined rules that are less complex than those of other existing stemmers. We identified several imprecisions that arise during the stemming process and propose solutions to eliminate them. Findings: The experiment was performed on a set of 762 words starting with the characters “a”, “b”, and “c”. Of the 762 words used for system validation and testing, 355 produced a different stem when processed with the Modified Porter Stemmer [MPS], and the remaining 407 produced the same stem with both stemmers. The Modified Porter Stemmer presented in this paper, implemented in Python, gave better results for 46% of the words. Novelty: This paper highlights the errors encountered when using the Porter algorithm and provides solutions that enhance the performance and accuracy of the stemming process. The resulting stemmer is named the “Modified Porter Stemmer” [MPS].
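The comparison described in the Findings can be sketched as a count of words whose stems differ between two stemmers. The snippet below is only an illustration under stated assumptions: `compare_stemmers`, `toy_stemmer_a`, and `toy_stemmer_b` are hypothetical names introduced here, and the two toy functions are simple suffix strippers, not the Porter stemmer or the MPS. The last lines only recompute the share implied by the reported counts (355 of 762).

```python
from typing import Callable, Dict, Iterable


def compare_stemmers(
    words: Iterable[str],
    stem_a: Callable[[str], str],
    stem_b: Callable[[str], str],
) -> Dict[str, float]:
    """Count how many words receive different stems from two stemmers."""
    word_list = list(words)
    total = len(word_list)
    different = sum(1 for w in word_list if stem_a(w) != stem_b(w))
    percent = 100 * different / total if total else 0.0
    return {
        "total": total,
        "different": different,
        "same": total - different,
        "percent_different": percent,
    }


def toy_stemmer_a(word: str) -> str:
    # Hypothetical stand-in: strips only a trailing "s".
    if word.endswith("s") and len(word) > 3:
        return word[:-1]
    return word


def toy_stemmer_b(word: str) -> str:
    # Hypothetical stand-in: strips a trailing "ing" or "s".
    for suffix in ("ing", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Made-up sample words, not taken from the paper's 762-word test set.
sample = ["abandons", "baking", "cats", "calling", "able"]
summary = compare_stemmers(sample, toy_stemmer_a, toy_stemmer_b)
print(summary)

# Share implied by the counts reported in the abstract.
reported_different, reported_same = 355, 407
reported_total = reported_different + reported_same
reported_share = 100 * reported_different / reported_total
print(f"{reported_different}/{reported_total} = {reported_share:.1f}%")
```

Replacing the two toy functions with real stemmer callables would reproduce the same kind of tally over an actual word list. Note that a differing stem is not by itself a better stem, so judging quality still requires comparison against reference stems.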

Keywords: Natural Language Processing; Stemmer; Porter’s Stemmer; Enhancement; Stemming Process


