Indian Journal of Science and Technology
Year: 2016, Volume: 9, Issue: 38, Pages: 1-11
Palo Hemanta Kumar and Mihir N. Mohanty
Background/Objectives: The analysis of emotion in human speech has a long history. Because the study and recognition of emotion benefits society in many respects, this work analyzes two closely similar emotions. Methods/Statistical Analysis: ‘Fear’ and ‘Nervousness’ are analyzed in comparison with the normal voice. The two emotions are found to be very closely correlated. The voice samples are in the Oriya language. Mel-frequency cepstral coefficients (MFCCs), a popular speech feature, are used. Because the fundamental frequency differs from voice to voice, it is a suitable feature for separating similar voice signals. Findings: The combination of these two features outperformed classification based on a single feature. In addition, performance was measured using the log-likelihood ratio. For recognition, a Gaussian mixture model (GMM) was selected and tested with these features. Novelty/Improvement: MFCCs alone yield an accuracy of 81.33%, whereas the combined features yield 86.01%, as reported in the results section.
Keywords: Correlation Coefficient, Feature Extraction, Fear State, Gaussian Mixture Model, Human Voice, Log-likelihood Ratio, Mel-frequency Cepstral Coefficient
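The abstract names a GMM scored with a log-likelihood ratio but gives no details of the setup. As a rough illustration only, the sketch below scores a few feature frames under two diagonal-covariance GMMs and takes the difference of the average log-likelihoods. It is not the authors' implementation: the function names, the two-dimensional frames (standing in for one MFCC value and a scaled fundamental frequency), and all model parameters are hypothetical values chosen for the example.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood of the frames under a diagonal GMM."""
    total = 0.0
    for x in frames:
        comps = [
            math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)
        ]
        # Log-sum-exp over mixture components for numerical stability.
        peak = max(comps)
        total += peak + math.log(sum(math.exp(c - peak) for c in comps))
    return total / len(frames)

def log_likelihood_ratio(frames, model_a, model_b):
    """Positive values favour model_a, negative values favour model_b."""
    return gmm_log_likelihood(frames, *model_a) - gmm_log_likelihood(frames, *model_b)

# Hypothetical, hand-set models: (weights, means, variances).
# Real models would be trained on labelled speech, e.g. with EM.
fear_model = ([0.5, 0.5], [[1.0, 2.0], [1.5, 2.5]], [[1.0, 1.0], [1.0, 1.0]])
normal_model = ([1.0], [[-1.0, -2.0]], [[1.0, 1.0]])

# Hypothetical feature frames: [one MFCC value, scaled fundamental frequency].
frames = [[1.2, 2.1], [0.9, 2.4]]

llr = log_likelihood_ratio(frames, fear_model, normal_model)
label = "fear" if llr > 0 else "normal"
print(label, llr)
```

In this sketch the decision threshold is zero; a threshold tuned on held-out data could be used instead.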