P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2021, Volume: 14, Issue: 7, Pages: 618-627

Original Article

Tri-level handwritten text segmentation techniques for Gujarati language

Received Date: 20 November 2020, Accepted Date: 27 February 2021, Published Date: 03 March 2021

Abstract

Objectives: To improve the efficiency of tri-level segmentation for handwritten Gujarati text. Methods: We use a hybrid approach to tri-level segmentation, extracting lines, words and characters from the page image. This study presents a segmentation paradigm that handles touching characters, the slope of lines written on the page, overlapping characters, etc. It was evaluated on a dataset of 500+ images that we created from different sentences written by different people. We used the horizontal projection technique for line segmentation, the scale-space technique for word segmentation and the vertical projection technique for character segmentation. Findings: The experimental results show that the proposed method is efficient for handwritten Gujarati text with diacritics. The accuracy obtained is 82% for character-level segmentation, 90% for word-level segmentation and 87% for line-level segmentation. Novelty: We have designed a methodology to segment Gujarati handwritten text with diacritics at all three levels: characters, words and lines. Applications: The proposed tri-level segmentation is a pre-processing task that can be used in any character recognition system, i.e., OCR.

Keywords: Deep learning; trilevel segmentation; handwritten Gujarati text

Copyright

© 2021 Rajyagor & Rakholia. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published by Indian Society for Education and Environment (iSee).
