Innovative Feature Sets for Machine Learning based Telugu Character Recognition

J. Jyothi, K. Manjusha, M. Anand Kumar and K. P. Soman

doi:10.17485/ijst/2015/v8i24/79996

Indian Journal of Science and Technology

Year: 2015, Volume: 8, Issue: 24, Pages: 1-7

Original Article

J. Jyothi*, K. Manjusha, M. Anand Kumar and K. P. Soman

Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham, Coimbatore - 641 112, Tamil Nadu, India

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

In this information age, sources such as historic documents, books and manuscripts are being digitized and made available worldwide over the internet as scanned copies. These scanned images, provided in colour or black and white for comfortable viewing, contain valuable information. Optical Character Recognition (OCR) technology makes it possible to search these digital copies for keywords. This paper presents a method for building an OCR system for the Telugu script, focusing mainly on the character recognition module. Features extracted through Discrete Wavelet Transform (DWT), Projection Profile (PP) and Singular Value Decomposition (SVD) are evaluated using k-Nearest Neighbour (k-NN) and Support Vector Machine (SVM) classifiers. The best results are obtained from the DWT features with the SVM classifier.
Keywords: Discrete Wavelet Transform, k-Nearest Neighbour, Optical Character Recognition, Singular Value Decomposition, Support Vector Machine, Telugu Character Recognition
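
As a rough illustration of the pipeline summarised in the abstract, the following Python sketch computes two of the three feature types (a projection profile and the approximation band of a one-level Haar DWT) for small binary images and classifies them with k-NN. It is not the authors' implementation: the wavelet family, decomposition level, image size, distance metric and all function names are assumptions made for this example, and the 4x4 patterns are toy stand-ins for Telugu glyphs. SVD features and the SVM classifier are left out only to keep the example free of external libraries.

```python
from collections import Counter
from math import dist


def projection_profile(img):
    """Row sums followed by column sums of a binary image (list of 0/1 rows)."""
    rows = [sum(r) for r in img]
    cols = [sum(c) for c in zip(*img)]
    return rows + cols


def haar_approximation(img):
    """Approximation (LL) band of a one-level 2-D Haar DWT, flattened.

    With the orthonormal Haar filter, each LL coefficient equals the sum
    of a non-overlapping 2x2 block divided by 2. Assumes even height/width.
    """
    h, w = len(img), len(img[0])
    return [
        (img[i][j] + img[i][j + 1] + img[i + 1][j] + img[i + 1][j + 1]) / 2.0
        for i in range(0, h - 1, 2)
        for j in range(0, w - 1, 2)
    ]


def knn_predict(train, query, k=1):
    """Majority vote among the k training vectors closest to `query`.

    `train` is a list of (feature_vector, label) pairs; Euclidean distance
    is an assumption of this sketch.
    """
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def classify(extract, samples, image, k=1):
    """Extract features with `extract`, then classify `image` with k-NN."""
    train = [(extract(img), label) for img, label in samples]
    return knn_predict(train, extract(image), k)


# Toy 4x4 patterns (hypothetical data, not Telugu characters).
VERTICAL = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0], [1, 1, 0, 0]]
HORIZONTAL = [[1, 1, 1, 1], [1, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]]
NOISY_VERTICAL = [[1, 1, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0]]

SAMPLES = [(VERTICAL, "vertical"), (HORIZONTAL, "horizontal")]

if __name__ == "__main__":
    for name, extract in [("DWT", haar_approximation), ("PP", projection_profile)]:
        print(name, classify(extract, SAMPLES, NOISY_VERTICAL))
```

Passing the feature extractor as an argument mirrors the comparison described in the abstract, where each feature set is evaluated separately against the same classifiers. A real system would use larger normalised glyph images and many training samples per character class.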