Indian Journal of Science and Technology
DOI: 10.17485/ijst/2015/v8i20/78461
Year: 2015, Volume: 8, Issue: 20, Pages: 1-10
Original Article
M. A. Yusnita1*, M. P. Paulraj2, Sazali Yaacob3, R. Yusuf1 and M. Nor Fadzilah1
1 Faculty of Electrical Engineering, Universiti Teknologi MARA Malaysia, Permatang Pauh - 13500, Penang, Malaysia; [email protected]
2 School of Mechatronic Engineering, University Malaysia Perlis, Ulu Pauh - 02600, Perlis, Malaysia
3 Universiti Kuala Lumpur Malaysian Spanish Institute, Kulim Hitech Park, Kulim - 09000, Kedah, Malaysia
To date, the Malaysian English (MalE) accents arising from the different ethnic groups of its populace have scarcely been investigated using empirical methods that could give a decisive conclusion on whether to treat MalE as a uniform or non-uniform variety. The popularly used Mel-Frequency Cepstral Coefficients (MFCC) and Linear Prediction Coefficients (LPC) feature extractors fail to perform well under noisy conditions. This paper proposes two new, less noise-susceptible feature extractors to mitigate the deficiency of MFCC and LPC. The first, statistical descriptors of Mel-Bands Spectral Energy (MBSE), is an enhancement of traditional filter-bank analysis; however, it increases the feature size fourfold. This issue is tackled by proposing a transformation using principal component analysis (PCA) to generate a new PCA-MBSE feature set. Experimental results indicated promising accuracy rates of 92.7% and 93.0% using the proposed PCA-MBSE features to distinguish between the Malay, Chinese and Indian accents of MalE speech for the male and female datasets respectively. Under severely noisy conditions, the standard MFCC and LPC features deteriorated faster than the MBSE-based features. PCA-MBSE features were the most robust: their performance deteriorated by only 17.1% and 13.6%, compared with 33.1% and 31.3% for MBSE features, for the male and female datasets respectively. LPC and MFCC features fared worse, with deterioration rates of 40.2% and 32.7% for LPC and 35.7% and 36.8% for MFCC, for the male and female datasets respectively.
Keywords: Accent Recognition, K-Nearest Neighbors, Linear Prediction Coefficients, Malaysian English, Mel-Bands Spectral Energy, Mel-Frequency Cepstral Coefficients, Principal Component Analysis
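The pipeline outlined in the abstract (mel-band spectral energies, statistical descriptors, a PCA transformation, then a K-nearest-neighbors classifier) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say which statistics are used, so the four chosen here (mean, standard deviation, minimum, maximum) are an assumption made only to reproduce the fourfold growth in feature size. The band count, frame length, hop size, number of principal components and value of k are likewise illustrative, and all function names are hypothetical.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_bands, n_fft, sr):
    """Triangular filters evenly spaced on the mel scale, shape (n_bands, n_fft // 2 + 1)."""
    hz_pts = mel_to_hz(np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_bands + 2))
    freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_bands, freqs.size))
    for b in range(n_bands):
        lo, mid, hi = hz_pts[b], hz_pts[b + 1], hz_pts[b + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[b] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def band_energy_descriptors(signal, sr, n_bands=20, frame_len=512, hop=256):
    """One vector per utterance: four statistics of log mel-band energy over frames.

    The choice of statistics is an assumption; the result has 4 * n_bands entries.
    """
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    log_e = np.log(power @ mel_filterbank(n_bands, frame_len, sr).T + 1e-10)
    return np.concatenate([log_e.mean(0), log_e.std(0), log_e.min(0), log_e.max(0)])

def pca_fit(X, n_components):
    """Return the training mean and the leading principal directions (rows)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_transform(X, mean, components):
    """Project feature vectors onto the retained principal directions."""
    return (X - mean) @ components.T

def knn_predict(train_X, train_y, test_X, k=3):
    """Majority vote among the k nearest training vectors (Euclidean distance).

    train_y must hold non-negative integer class labels.
    """
    dist = np.linalg.norm(test_X[:, None, :] - train_X[None, :, :], axis=2)
    nearest = train_y[np.argsort(dist, axis=1)[:, :k]]
    return np.array([np.bincount(row).argmax() for row in nearest])
```

In use, `band_energy_descriptors` would be applied to each utterance to build a feature matrix, `pca_fit` would be run on the training rows only, and both training and test rows would be passed through `pca_transform` before `knn_predict`. Fitting PCA on the training data alone keeps test utterances from influencing the projection.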