Indian Journal of Science and Technology
Year: 2020, Volume: 13, Issue: 35, Pages: 3652-3663
P Ravi1,2,3*, C Naveena1,3, Y H Sharathkumar4,3
1Department of Computer Science and Engineering, SJB Institute of Technology, Bengaluru, 560060, Karnataka, India. Tel.: +919480507409
2Department of Computer Science and Engineering, Vidyavardhaka College of Engineering, Mysuru, 570002, Karnataka, India
3Affiliated to Visvesvaraya Technological University, Belagavi, 590018, Karnataka, India
4Department of Information Science and Engineering, Maharaja Institute of Technology, Mysuru, 571438, Karnataka, India
Received Date: 31 July 2020, Accepted Date: 06 September 2020, Published Date: 03 October 2020
Motivation: Kannada is an ancient language and the official language of the Indian state of Karnataka. The study of ancient Kannada scripts from stone carvings, leaves, metal, cloth, paper and other sources enhances our knowledge of the traditions and culture practiced in Karnataka. Owing to their poor quality, variability and contrast, extracting information from ancient Kannada scripts and recognizing their characters is very challenging. Objectives: To design a suitable Optical Character Recognition (OCR) technique for reading ancient Kannada scripts. Method: Clustering by fast search and find of density peaks is a state-of-the-art density-based clustering algorithm that can effectively find clusters of arbitrary shape. However, it requires calculating the distances between all pairs of points in a data set to determine the density and separation of each point. Consequently, its computational cost is extremely high for large-scale data sets. In this work, the given document is first preprocessed. SIFT and SURF features are then extracted and clustered using K-means clustering. The similarity is computed using different measures. Findings: The classification accuracy was studied under different clustering methods (K-means, agglomerative and density-based clustering) with distance-based measures such as Euclidean and Manhattan. To evaluate the performance of the proposed method, we created our own database of Ashok, Kadamba, Hoysala and Mysuru scripts, and experiments were conducted on this database of 4 classes under 70, 50 and 30 different training models from each class. Novelty: We propose K-means clustering using SIFT and SURF features for ancient Kannada manuscripts. Experiments were conducted on our own database to validate the performance of the presented system.
Keywords: Historical Kannada; Karnataka; SIFT; SURF; KMeans
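The pipeline outlined in the abstract (local descriptors clustered with K-means, then compared with Euclidean or Manhattan distance) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that descriptors have already been extracted, and it uses toy 2-D vectors in place of real SIFT or SURF descriptors. The function names (`kmeans`, `histogram`, `classify`) are hypothetical, and the nearest-neighbour decision rule is an assumption, since the abstract does not name the classifier.

```python
import random


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to `point` (Euclidean)."""
    return min(range(len(centroids)), key=lambda i: euclidean(point, centroids[i]))


def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means; returns k centroids (the 'visual words')."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        updated = []
        for i, members in enumerate(clusters):
            if members:
                updated.append([sum(c) / len(members) for c in zip(*members)])
            else:
                updated.append(centroids[i])  # keep an empty cluster's centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids


def histogram(descriptors, centroids):
    """Normalised count of descriptors assigned to each centroid."""
    counts = [0] * len(centroids)
    for d in descriptors:
        counts[nearest(d, centroids)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]


def classify(query_hist, train_hists, train_labels, distance=euclidean):
    """Label of the training histogram closest to the query (assumed 1-NN rule)."""
    best = min(range(len(train_hists)), key=lambda i: distance(query_hist, train_hists[i]))
    return train_labels[best]


# Toy stand-ins for per-image descriptor sets (hypothetical data, not from the paper).
train_descriptors = [
    [[0, 0], [0, 1], [1, 0], [10, 10]],      # image labelled "Kadamba"
    [[10, 11], [11, 10], [11, 11], [1, 1]],  # image labelled "Hoysala"
]
train_labels = ["Kadamba", "Hoysala"]

pooled = [d for image in train_descriptors for d in image]
codebook = kmeans(pooled, k=2)
train_hists = [histogram(image, codebook) for image in train_descriptors]

query = [[0.5, 0.5], [0, 0.5], [10.5, 10.5]]
print(classify(histogram(query, codebook), train_hists, train_labels, distance=manhattan))
```

With real data, each image would contribute many high-dimensional descriptors and the number of clusters would be far larger than 2. Swapping `distance=euclidean` for `distance=manhattan` changes only the similarity measure, which is the kind of comparison the Findings describe.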
© 2020 Ravi et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee).