A Hierarchical Approach in Tamil Phoneme Classification using Support Vector Machine
Most speech recognition systems are built on the phoneme, the sub-word unit that is the basic sound unit of a language. This work proposes a hierarchical approach to phoneme classification that reduces time complexity and search space. A set of Tamil phonemes is classified hierarchically in three levels. Phoneme boundaries in a given speech utterance are identified using the Spectral Transition Measure (STM), and the phonemes are segmented at those boundaries. Mel-Frequency Cepstral Coefficients (MFCC) are extracted for each phoneme, which is represented by 9 frames including its contextual frames. At each level of the hierarchy, a different number of Support Vector Machine (SVM) models is built to classify each phoneme group or phoneme. The results show that the phoneme group recognition rate at levels 1 and 2 of the hierarchical approach is greatly improved compared with the flat classification model. The search space at levels 2 and 3 is also significantly smaller than in the flat phoneme classification model. The hierarchical phoneme classifier is well suited to phoneme recognition, which is useful in applications such as spoken term detection, out-of-vocabulary detection, named entity recognition, and spoken document retrieval.
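The boundary detection step can be illustrated with a minimal sketch. It assumes the common regression-slope form of the STM, in which the measure at frame n is the mean, over cepstral dimensions, of the squared slope of each coefficient across a window of ±I frames, and boundary candidates are taken at local maxima above a threshold. The function names, the window half-width of 2, the edge clamping, and the threshold value are illustrative assumptions, not settings taken from this work.

```python
from typing import List, Sequence


def spectral_transition_measure(
    frames: Sequence[Sequence[float]], half_window: int = 2
) -> List[float]:
    """Return one STM value per frame.

    For each cepstral coefficient, a regression slope is computed over
    frames n - half_window .. n + half_window; the STM is the mean of the
    squared slopes. Indices past either end are clamped to the edge frame
    (an assumption made here for simplicity).
    """
    n_frames = len(frames)
    if n_frames == 0:
        return []
    dim = len(frames[0])
    offsets = range(-half_window, half_window + 1)
    denom = sum(k * k for k in offsets)
    stm = []
    for n in range(n_frames):
        total = 0.0
        for i in range(dim):
            slope = 0.0
            for k in offsets:
                idx = min(max(n + k, 0), n_frames - 1)
                slope += k * frames[idx][i]
            slope /= denom
            total += slope * slope
        stm.append(total / dim)
    return stm


def pick_boundaries(stm: Sequence[float], threshold: float) -> List[int]:
    """Return frame indices that are local maxima of STM above threshold."""
    return [
        n
        for n in range(1, len(stm) - 1)
        if stm[n] > threshold and stm[n] >= stm[n - 1] and stm[n] > stm[n + 1]
    ]


# Toy input (hypothetical): two steady segments of 2-dimensional "cepstra",
# six frames each, with an abrupt change between frame 5 and frame 6.
toy_frames = [[0.0, 0.0]] * 6 + [[1.0, 1.0]] * 6
toy_stm = spectral_transition_measure(toy_frames, half_window=2)
toy_boundaries = pick_boundaries(toy_stm, threshold=0.05)
print(toy_boundaries)
```

In practice the input would be MFCC vectors computed from real speech, and the threshold and window size would need tuning on segmented data.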
Mel-Frequency Cepstral Coefficients, Spectral Transition Measure
This work is licensed under a Creative Commons Attribution 3.0 License.