P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2022, Volume: 15, Issue: 45, Pages: 2476-2481

Original Article

Development of Small Vocabulary Continuous Speech-to-Text System for Kannada Language/Dialects

Received Date: 19 September 2022, Accepted Date: 30 October 2022, Published Date: 07 December 2022

Abstract

Objectives: To develop a speech-to-text (STT) system for continuous Kannada language/dialects using the Kaldi speech recognition toolkit. Methods: Continuous Kannada speech data were collected in the field from 100 speakers/farmers of Karnataka state. The lexicon/dictionary and the phoneme set for Kannada language/dialects were created, and the collected speech data were transcribed using the transcriber tool. The automatic speech recognition (ASR) models were developed at different phoneme levels using Kaldi. Findings: This work aims to develop a robust small-vocabulary STT system for continuous Kannada speech using Kaldi. Various acoustic modelling techniques were used to build a robust ASR model, which achieved a word error rate (WER) of 0.23%. The performance of the developed ASR model was compared with existing works and analyzed through offline speech recognition. Novelty: Many STT systems have been developed for Indian and international languages/dialects, but not for the Kannada language. This work is the first of its kind to use Kaldi for the Kannada language under the constraint of limited data. The developed ASR model could be used further in developing an end-to-end ASR system for speech processing applications.

Keywords: Automatic Speech Recognition (ASR); Word Error Rate (WER); Continuous Kannada Speech Data; Kannada Language/Dialects; Lexicon
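
The abstract reports its result as a word error rate (WER). As a minimal sketch of how that metric is conventionally defined, the Python function below counts the word-level substitutions, deletions, and insertions needed to turn a reference transcript into a recognizer hypothesis, then divides by the number of reference words. It is an illustration only and not the authors' scoring pipeline; the function name `word_error_rate` and the placeholder tokens in the usage lines are assumptions made for this example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words (word-level Levenshtein).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Placeholder tokens stand in for transcribed words.
# One substitution (w2 -> wX) and one deletion (w4) over four reference words.
print(word_error_rate("w1 w2 w3 w4", "w1 wX w3"))  # 0.5
```

Multiplying the returned fraction by 100 gives a percentage, the form in which the abstract states its WER.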

Copyright

© 2022 Yadava et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published By Indian Society for Education and Environment (iSee)
