P-ISSN 0974-6846, E-ISSN 0974-5645

Indian Journal of Science and Technology



Year: 2017, Volume: 10, Issue: 19, Pages: 1-6

Original Article

Confusion Matrix Analysis of Syllable-Like Unit Extracted from Hindi Continuous Speech


Objectives: Speech segmentation is an important pre-processing task for improving the quality of text-to-speech (TTS) systems, and is therefore an important field of research. The basic requirement is to build segmented and labeled speech corpora. This paper describes the database, the segmentation technique, and a confusion matrix analysis of the segmentation of syllable structures occurring in the database. Methods: A set of 115 sentences, chosen for providing information to passengers of the Delhi metro rail and spoken by a single male speaker, was recorded. Analysis of the speech database shows that the syllable structures CV and CVC are the most frequent, covering 57.38% and 33.8% of the database respectively, while the structures CCVC, C C, and CVCC (where C = consonant, V = vowel, and ‘ ’ represents a nasalized vowel sound) cover less than 2%. The recorded sentences are segmented into syllable-like units using a group-delay algorithm. The segmentation technique was tested on speech data comprising 704 syllable tokens occurring in the database. The evaluation was carried out in two parts: the segmentation algorithm and the human perception approach. Findings: The quality of the segmentation, evaluated using a syllabic confusion matrix for the various syllable structures, shows segmentation accuracy rates of 79% and 88.5% for CV and CVC respectively. The overall segmentation accuracy achieved on the metro rail passenger information system (MRPIS) task was 80.6%. Applications: The database can be used in speech synthesizers and speech recognizers.
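
The per-structure and overall accuracy figures reported above are the kind of quantities that can be read off a confusion matrix. The following is a minimal sketch of that computation, not the authors' evaluation code. It assumes each syllable token has a reference structure label and a label obtained from the segmenter; the function names and the toy label lists are hypothetical and are not taken from the paper's data.

```python
from collections import Counter


def confusion_matrix(reference, hypothesis):
    """Count how often each reference label is paired with each hypothesis label."""
    if len(reference) != len(hypothesis):
        raise ValueError("reference and hypothesis must have the same length")
    return Counter(zip(reference, hypothesis))


def per_class_accuracy(matrix):
    """Fraction of tokens of each reference class that received the same label."""
    totals = Counter()
    correct = Counter()
    for (ref, hyp), count in matrix.items():
        totals[ref] += count
        if ref == hyp:
            correct[ref] += count
    return {label: correct[label] / totals[label] for label in totals}


def overall_accuracy(matrix):
    """Fraction of all tokens whose hypothesis label matches the reference."""
    total = sum(matrix.values())
    hits = sum(count for (ref, hyp), count in matrix.items() if ref == hyp)
    return hits / total


# Hypothetical toy labels for illustration only (not data from the paper).
reference = ["CV", "CV", "CV", "CV", "CVC", "CVC"]
hypothesis = ["CV", "CV", "CV", "CVC", "CVC", "CVC"]

matrix = confusion_matrix(reference, hypothesis)
print(per_class_accuracy(matrix))
print(overall_accuracy(matrix))
```

In this sketch the diagonal entries of the matrix (reference equal to hypothesis) count correctly segmented tokens, and the off-diagonal entries show which structures are confused with which.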

Keywords: Database, Delhi Metro Rail Corporation, Group- Delay Algorithm, Metro Rail Passenger Information Corpus, Segmentation, Syllable

