Indian Journal of Science and Technology
Year: 2022, Volume: 15, Issue: 27, Pages: 1364-1371
Original Article
Deepjyoti Kalita1*, Khurshid Alam Borbora2, Dipen Nath3
1Dept. of Computer Science & IT, Mangaldai College, Assam, India
2Dept. of Computer Science, IDOL, Gauhati University, Assam, India
3Dept. of Computer Science, Kokrajhar Govt. College, Kokrajhar, Assam, India
*Corresponding Author
Email: [email protected]
Received Date: 22 March 2022, Accepted Date: 06 June 2022, Published Date: 18 July 2022
Objectives: This work proposes a deep learning method for identifying spoken words in the Assamese language. Most deep neural network (DNN) based algorithms have been applied successfully in image recognition, computer vision, natural language processing and medical image analysis.
Methods: The method used here is Bidirectional Long Short Term Memory (BLSTM), which incorporates both past and future context. The speech database for this work was obtained from the repository of the Indian Language Technology Proliferation and Development Center (ILTP-DC). The repository contains 32,335 utterances by 1,000 male and female participants, covering 262 unique native Assamese words. The BLSTM based recognition model uses 10 of the 262 unique words; the remaining words are used to construct synthesized sentences. The feature extraction module uses 39 feature coefficients, composed of MFCC, ΔMFCC and ΔΔMFCC coefficients.
Findings: The BLSTM based recognition model has a Word Error Rate (WER) of 18.84% and an average accuracy of 98.12%, which sets a promising benchmark when compared with recent findings.
Novelty: This work attempts a different approach to detecting certain keywords of the Assamese language by adopting a deep learning methodology. The future objective is to improve the detection capability of the model by combining multiple DNN models in a hybrid approach and by including additional features.
Keywords: Bidirectional Long Short Term Memory; Deep Learning; Speech recognition; WER; MFCC
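As an illustration of the 39-coefficient feature vector described in the abstract, the following minimal Python sketch stacks static MFCCs with their first-order (Δ) and second-order (ΔΔ) time derivatives. It is not the authors' implementation. The split into 13 static coefficients per frame (39 / 3), the regression window width of 2 frames, the edge handling by repeating the first and last frame, and the function names `delta` and `stack_features` are all assumptions made for this example; the MFCC values themselves are taken as given.

```python
def delta(frames, width=2):
    """Regression-based time derivative of a feature sequence.

    frames: list of frames, each a list of floats of equal length.
    width:  number of frames on each side used in the regression
            (assumed value; the abstract does not state it).
    Edge frames are handled by repeating the first/last frame.
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                nxt = frames[min(t + n, n_frames - 1)][k]
                prv = frames[max(t - n, 0)][k]
                acc += n * (nxt - prv)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_features(mfcc):
    """Concatenate MFCC, delta-MFCC and delta-delta-MFCC per frame."""
    d1 = delta(mfcc)   # first-order differences (delta)
    d2 = delta(d1)     # second-order differences (delta-delta)
    return [m + a + b for m, a, b in zip(mfcc, d1, d2)]


# Toy input: 10 frames of 13 assumed static coefficients, rising linearly in time.
toy_mfcc = [[float(t)] * 13 for t in range(10)]
features = stack_features(toy_mfcc)
print(len(features), len(features[0]))
```

With 13 static coefficients per frame, each stacked frame has 39 values, matching the count given in the abstract. A sequence of such frames is the kind of input a bidirectional recurrent model would read in both time directions.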
© 2022 Kalita et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Published By Indian Society for Education and Environment (iSee)