Indian Journal of Science and Technology
Year: 2015, Volume: 8, Issue: 35, Pages: 1-4
D. Pugazhenthi* and S. Arul Vallarasi
Department of Computer Science, Quaid-E-Millath Government College for Women (Autonomous), Anna Salai, Chennai – 600002, Tamil Nadu, India
Background/Objectives: Template matching is a simple method to implement for character recognition of the English alphabet. This paper proposes recognition of Tamil characters in the Bamini Tamil font using template matching. Materials/Methods: A skew-free document image is binarized in the preprocessing step and then segmented. Each line of text is segmented using horizontal projection analysis, and each character within a line is segmented using connected component processing. Each segmented character is then correlated with the templates preloaded in the system, and the template with the maximum correlation determines the character. Every segmented input is checked against the preloaded templates in the same way. The templates are mapped onto Tamil Unicode for recognition, and the text is reconstructed using Unicode fonts to produce machine-editable Unicode text in a text file. Conclusion/Findings: The system gives considerably good results when the base height of a character in the document image is greater than 20 pixels. Applications/Improvements: Recognition of characters in other fonts is a possible future extension, and other methods are also being considered for further implementation.
Keywords: Connected Component Labeling, OCR, Offline Character Recognition, Segmentation, Template Matching
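The pipeline summarized in the abstract (binarization, line segmentation by horizontal projection, character segmentation by connected components, and correlation against templates mapped to Tamil Unicode) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a fixed binarization threshold of 128, 8-connected components, a normalized 2-D correlation coefficient as the matching score, and segmented glyphs that already have the same size as the templates. The 3x3 patterns in `TEMPLATES` are toy shapes, not real Bamini glyphs, and all function names, including the `render` helper, are hypothetical.

```python
from collections import deque

# Toy 3x3 templates keyed by Tamil Unicode characters (assumed shapes, not real glyphs).
TEMPLATES = {
    "\u0b85": [[1, 1, 1], [1, 0, 1], [1, 1, 1]],  # ring pattern standing in for U+0B85
    "\u0b87": [[0, 1, 0], [1, 1, 1], [0, 1, 0]],  # cross pattern standing in for U+0B87
}

def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 = ink, 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]

def segment_lines(binary):
    """Horizontal projection: return (top, bottom) row ranges that contain ink."""
    profile = [sum(row) for row in binary]
    lines, start = [], None
    for y, count in enumerate(profile):
        if count and start is None:
            start = y
        elif not count and start is not None:
            lines.append((start, y))
            start = None
    if start is not None:
        lines.append((start, len(profile)))
    return lines

def connected_components(binary):
    """Find 8-connected ink regions; return bounding boxes sorted left to right."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, bottom, left, right = y, y, x, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for ny in (cy - 1, cy, cy + 1):
                    for nx in (cx - 1, cx, cx + 1):
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, bottom + 1, left, right + 1))
    return sorted(boxes, key=lambda box: box[2])

def correlation(a, b):
    """Normalized 2-D correlation coefficient of two equally sized images."""
    fa = [v for row in a for v in row]
    fb = [v for row in b for v in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = (sum((x - ma) ** 2 for x in fa) * sum((y - mb) ** 2 for y in fb)) ** 0.5
    return num / den if den else 0.0

def recognize(glyph, templates):
    """Return the Unicode character whose template correlates best with the glyph."""
    return max(templates, key=lambda ch: correlation(glyph, templates[ch]))

def read_text(gray, templates):
    """Run the full sketch: binarize, split lines, split characters, match templates."""
    binary = binarize(gray)
    text_lines = []
    for top, bottom in segment_lines(binary):
        band = binary[top:bottom]
        chars = []
        for t, b, l, r in connected_components(band):
            glyph = [row[l:r] for row in band[t:b]]
            chars.append(recognize(glyph, templates))
        text_lines.append("".join(chars))
    return "\n".join(text_lines)

def render(text_lines, templates):
    """Hypothetical helper: draw toy glyphs into a grayscale page (0 = ink, 255 = paper)."""
    page = []
    for line in text_lines:
        for r in range(3):
            row = []
            for ch in line:
                row += [0 if v else 255 for v in templates[ch][r]] + [255]
            page.append(row)
        page.append([255] * len(page[-1]))  # blank row separating text lines
    return page

if __name__ == "__main__":
    page = render(["\u0b85\u0b87", "\u0b87\u0b85"], TEMPLATES)
    print(read_text(page, TEMPLATES))
```

A practical system would additionally resize each segmented glyph to the template size before correlation and would need to merge components for characters made of several disconnected strokes. Both steps are omitted here for brevity.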