Year: 2015, Volume: 8, Issue: 24, Pages: 1-7

Original Article

Randomized Kernel Approach for Named Entity Recognition in Tamil


In this paper, we present a new approach to Named Entity Recognition (NER) for the Tamil language using the Random Kitchen Sink (RKS) algorithm. Named Entity Recognition is the process of identifying Named Entities (NEs) in text and classifying them into predefined categories such as person, location, and organization. Much work has been done on Named Entity Recognition for English and for Indian languages using various machine learning approaches. In this work, we implement an NER system for Tamil using the RKS algorithm, which is a statistical, supervised approach. The NER system is also implemented using Support Vector Machine (SVM) and Conditional Random Field (CRF). The overall performance of the NER system was 86.61% for RKS, 81.62% for SVM, and 87.21% for CRF. Additional results were obtained for SVM and CRF with an increased corpus size, giving a performance of 86.06% and 87.20%, respectively.
Keywords: Conditional Random Field (CRF), Named Entities (NEs), Named Entity Recognition (NER), Natural Language Processing (NLP), Random Kitchen Sink (RKS), Support Vector Machine (SVM) 
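
The abstract does not describe how RKS is applied, so the following is only an illustrative sketch. It assumes that RKS refers to the random Fourier feature construction, in which inputs are mapped through randomly drawn cosine features so that inner products in the new space approximate an RBF kernel, after which a linear classifier can be trained on the mapped features. The function names (make_rks_map, dot, rbf), the parameter gamma, and the feature dimensions are hypothetical and are not taken from the paper.

```python
import math
import random


def make_rks_map(dim, n_features, gamma, seed=0):
    """Build a random feature map approximating the RBF kernel
    k(x, y) = exp(-gamma * ||x - y||^2).

    All names and parameters here are illustrative assumptions,
    not the paper's implementation.
    """
    rng = random.Random(seed)
    # Frequencies are sampled from N(0, 2 * gamma), the spectral
    # density that corresponds to the RBF kernel above.
    sigma = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, sigma) for _ in range(dim)]
               for _ in range(n_features)]
    # Random phase offsets, uniform on [0, 2*pi).
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def phi(x):
        # One cosine feature per random frequency and offset.
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, offsets)]

    return phi


def dot(u, v):
    """Plain inner product of two equal-length lists."""
    return sum(a * c for a, c in zip(u, v))


def rbf(x, y, gamma):
    """Exact RBF kernel, used only to check the approximation."""
    return math.exp(-gamma * sum((a - c) ** 2 for a, c in zip(x, y)))
```

In an NER setting, each token's feature vector (for example, encoded word, affix, and context features) would be passed through phi before training a linear classifier. The inner product dot(phi(x), phi(y)) should approach rbf(x, y, gamma) as n_features grows.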


