Improved Support Vector-Recurrent Neural Network with Optimal Feature Selection-based Spoken Language Identification System

Irshad Ahmad Thukroo; Rumaan Bashir; J Kaiser Giri

doi:10.17485/IJST/v16i10.2119

Indian Journal of Science and Technology

Year: 2023, Volume: 16, Issue: 10, Pages: 680-697

Original Article

Irshad Ahmad Thukroo¹, Rumaan Bashir²*, J Kaiser Giri²

¹Research Scholar, Department of Computer Science, Islamic University of Science & Technology, Kashmir
²Associate Professor, Department of Computer Science, Islamic University of Science & Technology, Kashmir

*Corresponding Author
Email: [email protected]

Received Date: 02 November 2022, Accepted Date: 09 December 2022, Published Date: 07 March 2023

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Objective: Spoken language identification is at the forefront of language recognition tasks, and speech is the most significant medium of communication; the accuracy of recently developed spoken language recognition systems therefore needs to be improved. The purpose of this paper is to enhance the Spoken Language Identification (SLID) model using a hybrid of machine learning and deep learning for the regionally spoken languages of Jammu & Kashmir (JK) and Ladakh. Method: Speech signals of different languages of JK and Ladakh are first collected manually from diverse sources and preprocessed using the Spectral Noise Gate (SNG) filtering technique. Features are then extracted from the preprocessed signals: cepstral features such as Mel-frequency Cepstral Coefficients (MFCCs) and Relative Spectral Transform-Perceptual Linear Prediction (RASTA-PLP), and spectral features such as spectral roll-off and spectral flatness. Findings: The feature vector produced by this extraction is long and needs to be reduced. Optimal feature selection is therefore performed using a new meta-heuristic algorithm, the Adaptive Distance-based Tunicate Swarm Algorithm (AD-TSA), with minimum correlation as the objective. Finally, language identification is handled by a hybrid classifier, the Improved Support Vector Machine-Recurrent Neural Network (ISVM-RNN). Novelty: AD-TSA enhances the identification learning algorithm by taking minimum correlation among features as the objective, so as to obtain the minimum number of features sufficient for language identification. The efficiency of the proposed hybrid approach is validated by experiments on a user-defined language database of JK and Ladakh speech signals, implemented in Python.

Keywords: Language Identification; Kashmir Languages; Optimal Feature Selection; Improved Support Vector Machine-Recurrent Neural Network; Adaptive Distance-based Tunicate Swarm Algorithm
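
The abstract states that feature selection takes minimum correlation among features as its objective, but does not give the exact formula. The following is a minimal sketch of one plausible reading, assuming the objective is the mean absolute pairwise Pearson correlation over the selected feature columns. The function names `pearson` and `mean_abs_correlation` are hypothetical and do not come from the paper, and the AD-TSA search itself is not shown.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def mean_abs_correlation(features, selected):
    """Assumed objective: mean |r| over all pairs of selected feature columns.

    features: list of rows, one row per utterance and one column per feature.
    selected: list of column indices proposed by the optimizer.
    Lower values mean the selected features are less redundant.
    """
    if len(selected) < 2:
        return 0.0  # no pairs to compare
    columns = {j: [row[j] for row in features] for j in selected}
    pairs = list(combinations(selected, 2))
    total = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs)
    return total / len(pairs)
```

A metaheuristic such as AD-TSA would evaluate a function of this kind for each candidate subset and keep the subsets with the lowest score. Because a single feature trivially scores zero here, a practical objective would presumably also constrain or penalize the subset size; that detail is not specified in the abstract.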

Copyright

© 2023 Thukroo et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Published by Indian Society for Education and Environment (iSee).