TY - GEN
T1 - Optimizing the acoustic modeling from an unbalanced bi-lingual corpus
AU - Lyu, Dau Cheng
AU - Lyu, Ren Yuan
PY - 2008
Y1 - 2008
AB - Clustering the phoneme set for accurate modeling is important in multilingual speech recognition, especially when the training corpora available for each language are unbalanced, as is the case between a major language, like Mandarin, and a minor language, like Taiwanese. In this paper, we present a data-driven approach that not only acquires a proper phoneme set but also optimizes the acoustic modeling in this situation. To obtain a phoneme set suited to the unbalanced corpus, we use agglomerative hierarchical clustering with the delta Bayesian information criterion (delta-BIC). Then, for training each acoustic model, we apply a parametric modeling technique, model complexity selection, which adjusts the number of mixtures to match the acoustic model to the new phoneme set and the available training data. In our experiments, the proposed approach achieves a 7.8% relative reduction in syllable error rate over the best result of the knowledge-based approach.
KW - Delta-BIC
KW - Multilingual speech recognition
KW - Phoneme set clustering
UR - http://www.scopus.com/inward/record.url?scp=51449089258&partnerID=8YFLogxK
U2 - 10.1109/ICASSP.2008.4518606
DO - 10.1109/ICASSP.2008.4518606
M3 - Conference contribution
AN - SCOPUS:51449089258
SN - 1424414849
SN - 9781424414840
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - 4301
EP - 4304
BT - 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
T2 - 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP
Y2 - 31 March 2008 through 4 April 2008
ER -