Statistical Descriptors of Mel-Bands Spectral Energy Features with Feature Reduction for Robust Accent Recognition in Malaysian English

M. A. Yusnita; M. P. Paulraj; Sazali Yaacob; R. Yusuf and M. Nor Fadzilah

doi:10.17485/ijst/2015/v8i20/78461

Indian Journal of Science and Technology

Year: 2015, Volume: 8, Issue: 20, Pages: 1-10

Original Article

M. A. Yusnita¹*, M. P. Paulraj², Sazali Yaacob³, R. Yusuf¹ and M. Nor Fadzilah¹

¹Faculty of Electrical Engineering, Universiti Teknologi MARA Malaysia, Permatang Pauh - 13500, Penang, Malaysia
²School of Mechatronic Engineering, University Malaysia Perlis, Ulu Pauh - 02600, Perlis, Malaysia
³Universiti Kuala Lumpur Malaysian Spanish Institute, Kulim Hitech Park, Kulim - 09000, Kedah, Malaysia

This work is licensed under a Creative Commons Attribution 4.0 International License.

Abstract

To date, the accents of Malaysian English (MalE) that arise from the different ethnic groups of its populace have scarcely been investigated using empirical methods that could give a decisive conclusion on whether to treat MalE as a uniform or a non-uniform variety. The popular Mel-Frequency Cepstral Coefficients (MFCC) and Linear Prediction Coefficients (LPC) feature extractors fail to perform well under noisy conditions. This paper proposes two new feature extractors that are less susceptible to noise, to mitigate this deficiency of MFCC and LPC. Statistical descriptors of Mel-Bands Spectral Energy (MBSE) are an enhancement of traditional filter-bank analysis; however, they increase the feature size fourfold. This issue is tackled by proposing a transformation using principal component analysis (PCA) to generate a new PCA-MBSE feature set. Experimental results indicated promising accuracy rates of 92.7% and 93.0% using the proposed PCA-MBSE features to distinguish between the Malay, Chinese and Indian accents of MalE speech for the male and female datasets, respectively. Under severely noisy conditions, the standard MFCC and LPC features deteriorated faster than the MBSE-based features. The PCA-MBSE features were the most robust: their performance deteriorated by only 17.1% and 13.6% for the male and female datasets, respectively, compared with 33.1% and 31.3% for the MBSE features. Poorer results were obtained for the LPC features, with deterioration rates of 40.2% and 32.7%, and for the MFCC features, with deterioration rates of 35.7% and 36.8%, for the male and female datasets, respectively.
Keywords: Accent Recognition, K-Nearest Neighbors, Linear Prediction Coefficients, Malaysian English, Mel-Bands Spectral Energy, Mel-Frequency Cepstral Coefficients, Principal Component Analysis
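
The fourfold growth in feature size mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the statistical descriptors, so the four used here (mean, standard deviation, minimum and maximum of each band's energy across frames) are an assumption, as are the function names `band_descriptors` and `utterance_features` and the toy input values. The sketch only shows how pooling four statistics per mel band turns B band energies into 4B features; the PCA step that would then reduce this set is not shown.

```python
import statistics


def band_descriptors(band_energies):
    """Summarise one mel band's per-frame energies with four statistics.

    Assumption: mean, standard deviation, minimum and maximum stand in for
    the paper's descriptors, which the abstract does not specify.
    """
    return [
        statistics.fmean(band_energies),
        statistics.pstdev(band_energies),
        min(band_energies),
        max(band_energies),
    ]


def utterance_features(frames):
    """Build one feature vector from a list of frames.

    frames: list of per-frame lists, each holding one energy per mel band.
    Returns a flat list with four descriptors per band.
    """
    n_bands = len(frames[0])
    features = []
    for b in range(n_bands):
        # Collect the energy of band b across all frames of the utterance.
        track = [frame[b] for frame in frames]
        features.extend(band_descriptors(track))
    return features


# Toy input: 3 frames x 2 mel bands (made-up numbers, not data from the paper).
frames = [[1.0, 4.0], [2.0, 4.0], [3.0, 4.0]]
features = utterance_features(frames)
```

With two bands the resulting vector has eight entries, which is the kind of growth that motivates a dimensionality-reduction step such as PCA.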