Automatic Speech Recognition of Pathological Voice

Algabri Mohammed; Alsulaiman Mansour; Muhammad Ghulam; Zakariah Mohammed; Tamer A. Mesallam; Khalid H. Malki; Farahat Mohamed; M. A. Mekhtiche and Bencherif Mohamed

doi:10.17485/ijst/2015/v8i32/92130

Indian Journal of Science and Technology

Year: 2015, Volume: 8, Issue: 32, Pages: 1-6

Original Article

Algabri Mohammed¹, Alsulaiman Mansour¹, Muhammad Ghulam¹, Zakariah Mohammed¹, Tamer A. Mesallam², Khalid H. Malki², Farahat Mohamed², M. A. Mekhtiche¹ and Bencherif Mohamed¹

¹College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia
²College of Medicine, King Saud University, Riyadh, Saudi Arabia

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Background/Objectives: Automatic speech recognition (ASR) is useful in many applications, and various ASR systems with good performance have been developed for normal speakers. The speech produced by a patient with a voice disorder differs from that of a normal speaker because of irregular vibration and incomplete closure of the vocal folds. An investigation of different speech features is therefore required to develop an ASR system that performs well for both pathological and normal speakers. Methods: In this paper, we propose an automatic speech recognition system for normal and pathological voice, built with the Hidden Markov Model Toolkit (HTK). Four techniques are applied for feature extraction: Mel Frequency Cepstral Coefficients (MFCC), Perceptual Linear Prediction (PLP), RelAtive SpecTrAl Perceptual Linear Prediction (RASTA-PLP), and Linear Prediction Coefficients (LPC). The database used to evaluate the developed system includes a total of 297 speakers; 121 of them are normal speakers, and the remainder cover five types of vocal fold disorders. Findings: Experimental results show that the developed system gives good accuracy for both normal and pathological voice. The highest accuracy is 94.44% with a word error rate of 5.55% for normal voice, and 88.63% with a word error rate of 11.63% for pathological voice. A fuzzy logic controller is proposed to automatically segment normal and disordered voice.
Keywords: Automatic Speech Recognition, Fuzzy Logic Control, HTK, Voice Pathology
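
The accuracy and word error rate figures in the abstract are scoring measures derived from aligning each recognized transcription with its reference. The abstract does not state the formulas, so the following is only a minimal sketch, assuming the common definitions WER = (S + D + I) / N and accuracy = 1 - WER, where S, D and I are the substitutions, deletions and insertions in a minimum-edit alignment and N is the number of reference words. The function names and the sample word strings are hypothetical and are not part of HTK or of the paper's data.

```python
def edit_counts(ref_words, hyp_words):
    """Return (substitutions, deletions, insertions) of a minimum-edit alignment."""
    rows, cols = len(ref_words) + 1, len(hyp_words) + 1
    # table[i][j] = (total_errors, S, D, I) for ref_words[:i] vs hyp_words[:j]
    table = [[(0, 0, 0, 0)] * cols for _ in range(rows)]
    for i in range(1, rows):
        table[i][0] = (i, 0, i, 0)  # empty hypothesis: every reference word deleted
    for j in range(1, cols):
        table[0][j] = (j, 0, 0, j)  # empty reference: every hypothesis word inserted
    for i in range(1, rows):
        for j in range(1, cols):
            total, subs, dels, ins = table[i - 1][j - 1]
            if ref_words[i - 1] == hyp_words[j - 1]:
                diagonal = (total, subs, dels, ins)          # correct word, no cost
            else:
                diagonal = (total + 1, subs + 1, dels, ins)  # substitution
            total, subs, dels, ins = table[i - 1][j]
            deletion = (total + 1, subs, dels + 1, ins)
            total, subs, dels, ins = table[i][j - 1]
            insertion = (total + 1, subs, dels, ins + 1)
            # Keep the candidate with the fewest total errors.
            table[i][j] = min(diagonal, deletion, insertion, key=lambda c: c[0])
    return table[-1][-1][1:]


def word_error_rate(reference, hypothesis):
    """WER = (S + D + I) / N, assuming whitespace-separated words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    subs, dels, ins = edit_counts(ref_words, hypothesis.split())
    return (subs + dels + ins) / len(ref_words)


def word_accuracy(reference, hypothesis):
    """Accuracy under the assumed definition 1 - WER (can be negative)."""
    return 1.0 - word_error_rate(reference, hypothesis)
```

For example, scoring the hypothetical hypothesis "one too three" against the reference "one two three four" yields one substitution and one deletion over four reference words. When several alignments tie on total errors, the split between S, D and I depends on the tie-breaking order, but the WER itself does not.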