P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology

Year: 2022, Volume: 15, Issue: 1, Pages: 19-27

Original Article

Text to Speech Synthesizer for Tigrigna Linguistic using Concatenative Based approach with LSTM model

Received Date: 13 November 2021, Accepted Date: 30 December 2021, Published Date: 21 January 2022

Abstract

Objectives: To describe a text-to-speech system for the Tigrigna language based on a concatenative (diphone) architecture and to develop a prototype text-to-speech synthesizer for Tigrigna. Methods: Direct observation and a review of articles were used to identify the full set of strings that represent the language. The tools used in this work are MATLAB, LPC, and Python. An LSTM deep learning model was applied to measure accuracy, precision, recall, and F-score. Findings: The overall word-level performance of the system, evaluated with the NeoSpeech tool, is 78%, which is encouraging. The intelligibility and naturalness of the synthesized speech at the sentence level, measured on the MOS scale, are 3.28 and 3.27 respectively. In the experiments, the LSTM deep learning model achieves an accuracy of 91.05%, a precision of 78.05%, a recall of 86.59%, and an F-score of 83.05%. These performance, intelligibility, and naturalness values are encouraging and show that diphone speech units are good candidates for building a fully functional speech synthesizer. Novelty: The researchers present the first LSTM-based deep learning text-to-speech model for the Tigrigna language, which can serve as a baseline for further research on Tigrigna and other languages.
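
For readers who want to see how the kind of evaluation reported above can be set up, the following is a minimal sketch (not the authors' code) of a character-level LSTM classifier and the accuracy/precision/recall/F-score computation, written in Python with TensorFlow/Keras and scikit-learn. The vocabulary size, sequence length, number of classes, layer widths, and training data are all hypothetical placeholders rather than values taken from the paper.

    # Minimal sketch of an LSTM sequence classifier of the kind evaluated in the
    # abstract; all sizes and data below are hypothetical placeholders.
    import numpy as np
    from tensorflow.keras import layers, models
    from sklearn.metrics import precision_recall_fscore_support

    VOCAB_SIZE = 64    # assumed number of Tigrigna character/diphone symbols
    SEQ_LEN = 40       # assumed input sequence length
    NUM_CLASSES = 10   # assumed number of target speech-unit classes

    model = models.Sequential([
        layers.Embedding(input_dim=VOCAB_SIZE, output_dim=32),  # map symbols to vectors
        layers.LSTM(64),                                         # model the symbol sequence
        layers.Dense(NUM_CLASSES, activation="softmax"),         # predict the speech unit
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])

    # Random dummy data stands in for real text/speech-unit pairs.
    x = np.random.randint(0, VOCAB_SIZE, size=(128, SEQ_LEN))
    y = np.random.randint(0, NUM_CLASSES, size=(128,))
    model.fit(x, y, epochs=2, batch_size=32, verbose=0)

    # Precision, recall, and F-score as reported in the Findings can be obtained
    # from held-out predictions; here the training data is reused for brevity.
    pred = model.predict(x, verbose=0).argmax(axis=-1)
    print(precision_recall_fscore_support(y, pred, average="macro", zero_division=0))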

Keywords: LSTM; speech synthesis; Tigrigna syllables; Text-to-Speech; Concatenative approach


Copyright

© 2022 Araya & Alehegn. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Published by the Indian Society for Education and Environment (iSee)
