Optimizing the acoustic modeling from an unbalanced bi-lingual corpus

Dau Cheng Lyu*, Ren Yuan Lyu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Phoneme set clustering for accurate modeling is important in multilingual speech recognition, especially when the training corpora available for each language are mismatched, as is the case between a major language, like Mandarin, and a minor language, like Taiwanese. In this paper, we present a data-driven approach not only for acquiring a proper phoneme set but also for optimizing the acoustic modeling in this situation. To obtain a phoneme set that is suitable for the unbalanced corpus, we use agglomerative hierarchical clustering with the delta Bayesian information criterion (delta-BIC). Then, for training each of the acoustic models, we choose a parametric modeling technique, model complexity selection, to adjust the number of mixtures so that each acoustic model fits both the new phoneme set and the available training data. The experimental results are encouraging: the proposed approach reduces the syllable error rate by a relative 7.8% over the best result of the knowledge-based approach.
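
As an illustration of the clustering step named in the abstract, the following is a minimal sketch of agglomerative merging driven by delta-BIC. It is not the authors' implementation. It assumes each phoneme is modeled by a single full-covariance Gaussian over its feature frames, and the names `delta_bic`, `cluster_phonemes`, the penalty weight `lam`, and the phoneme labels are hypothetical. Two clusters are merged while the lowest pairwise delta-BIC stays below zero, meaning that one shared model is preferred over two separate ones.

```python
import numpy as np


def log_det_cov(x):
    """Log-determinant of the ML covariance of frames x (n_frames, dim)."""
    dim = x.shape[1]
    # A small ridge keeps the covariance invertible (an assumption of this sketch).
    cov = np.cov(x, rowvar=False, bias=True) + 1e-6 * np.eye(dim)
    _, logdet = np.linalg.slogdet(cov)
    return logdet


def delta_bic(x1, x2, lam=1.0):
    """Delta-BIC for modeling x1 and x2 separately versus as one Gaussian.

    Negative values favor merging; positive values favor keeping them apart.
    """
    n1, n2 = len(x1), len(x2)
    n = n1 + n2
    dim = x1.shape[1]
    pooled = np.vstack([x1, x2])
    # Likelihood gain from using two Gaussians instead of one.
    gain = 0.5 * (n * log_det_cov(pooled)
                  - n1 * log_det_cov(x1)
                  - n2 * log_det_cov(x2))
    # Penalty for the extra mean and covariance parameters.
    penalty = 0.5 * lam * (dim + dim * (dim + 1) / 2) * np.log(n)
    return gain - penalty


def cluster_phonemes(frames, lam=1.0):
    """Merge phoneme clusters greedily while the best delta-BIC is negative.

    frames maps a phoneme label to an array of shape (n_frames, dim).
    Returns a sorted list of tuples of merged labels.
    """
    clusters = {(name,): x for name, x in frames.items()}
    while len(clusters) > 1:
        keys = list(clusters)
        score, a, b = min(
            ((delta_bic(clusters[a], clusters[b], lam), a, b)
             for i, a in enumerate(keys) for b in keys[i + 1:]),
            key=lambda t: t[0],
        )
        if score >= 0:  # no remaining pair is better modeled jointly
            break
        clusters[a + b] = np.vstack([clusters.pop(a), clusters.pop(b)])
    return sorted(clusters)
```

In this sketch a larger `lam` makes merging more likely, which yields a smaller phoneme set. The model complexity selection step (choosing the number of mixtures per model) is not covered here.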

Original language: English
Title of host publication: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Pages: 4301-4304
Number of pages: 4
State: Published - 2008
Event: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP - Las Vegas, NV, United States
Duration: 31 Mar 2008 → 4 Apr 2008

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Country/Territory: United States
City: Las Vegas, NV
Period: 31/03/08 → 04/04/08

Keywords

  • Delta-BIC
  • Multilingual speech recognition
  • Phoneme set clustering
