Speaker Identification System using Gaussian Mixture Model and Support Vector Machines (GMM-SVM) under Noisy Conditions

R. Dhineshkumar, A. Balaji Ganesh and S. Sasikala

doi:10.17485/ijst/2016/v9i19/93870



Indian Journal of Science and Technology


Year: 2016, Volume: 9, Issue: 19, Pages: 1-6

Original Article


R. Dhineshkumar¹*, A. Balaji Ganesh² and S. Sasikala³

¹School of Computing Science and Engineering, Vellore Institute of Technology University, Chennai Campus, Vellore – 632014, Tamil Nadu, India; [email protected]
²TIFAC-CORE, Velammal Engineering College, Chennai - 600066, Tamil Nadu, India; [email protected]
³Department of Computer Science, Institute of Distance Education, University of Madras, Chennai - 600005, Tamil Nadu, India; [email protected]

*Author for correspondence
R. Dhineshkumar
School of Computing Science and Engineering, Vellore Institute of Technology University, Chennai Campus, Vellore – 632014, Tamil Nadu, India; [email protected]

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

Background: Automatic Speaker Identification (SID) systems have been a major breakthrough and are crucial in many real-world applications. Methods: This work addresses the SID task with a GMM-SVM approach in a three-stage process. First, the Gammatone Frequency Cepstral Coefficients (GFCC) and Mean Hilbert Envelope Coefficients (MHEC) of the speakers are extracted. Second, these features are modeled using a Gaussian Mixture Model (GMM); adapting the GMM means to the extracted acoustic features yields the corresponding supervectors, which are used to train a Support Vector Machine (SVM). Finally, recognition is performed by feeding the supervectors of the noisy test utterance, masked by an Ideal Binary Mask (IBM), into the SVM model, and the recognition accuracy is compared for GFCC, MHEC and RASTA-MFCC under different noisy conditions. Findings: Evaluation results show that SID performance with MHEC is considerably better than with the other two features. Applications: Major areas that implement automatic SID include forensics, surveillance and audio biometrics.

Keywords: GMM-SVM, Gammatone Frequency Cepstral Coefficients, Ideal Binary Mask, Mean Hilbert Envelope Coefficients, Robust Speaker Identification
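
The supervector step named in the abstract can be illustrated with a minimal sketch. It assumes a diagonal-covariance background GMM and relevance-MAP adaptation of the means, which is a common way to build GMM supervectors; the abstract does not state the exact adaptation rule used, so the function name `map_adapt_supervector` and the `relevance` parameter are hypothetical choices made here for illustration only.

```python
import numpy as np


def map_adapt_supervector(frames, weights, means, variances, relevance=16.0):
    """Stack MAP-adapted GMM means into one supervector.

    frames:    (T, D) feature frames of one utterance (e.g. GFCC or MHEC)
    weights:   (C,)   mixture weights of a background GMM
    means:     (C, D) component means
    variances: (C, D) diagonal covariances
    relevance: assumed relevance factor (not taken from the paper)
    """
    frames = np.asarray(frames, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    # Log-density of every frame under every diagonal Gaussian: shape (T, C).
    diff = frames[:, None, :] - means[None, :, :]
    log_pdf = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )

    # Component posteriors per frame, normalised in the log domain.
    log_post = np.log(weights) + log_pdf
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Soft counts and posterior-weighted data means per component.
    counts = post.sum(axis=0)
    data_means = (post.T @ frames) / np.maximum(counts, 1e-10)[:, None]

    # Interpolate between the data means and the background means.
    alpha = (counts / (counts + relevance))[:, None]
    adapted = alpha * data_means + (1.0 - alpha) * means

    # Concatenate the C adapted means into a single (C * D,) vector.
    return adapted.reshape(-1)
```

Under these assumptions, each training utterance yields one fixed-length vector of size C × D, and those vectors, labeled by speaker, could then be passed to any SVM trainer (for example scikit-learn's `sklearn.svm.SVC` with a linear kernel) before classifying the supervector of a masked test utterance.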