TY - GEN
T1 - Maximum entropy modeling of acoustic and linguistic features
AU - Chueh, Chuang Hua
AU - Chien, Jen Tzung
PY - 2006/5
Y1 - 2006/5
N2 - Traditionally, a speech recognition system is built on the assumption that acoustic and linguistic information sources are independent. Parameters of the hidden Markov model and the n-gram model are estimated individually and then plugged into a maximum a posteriori classification rule. However, acoustic and linguistic features are inherently correlated, so modeling performance is limited accordingly. This study aims to relax the independence assumption and achieve sophisticated acoustic and linguistic modeling for speech recognition. We propose an integrated approach based on the maximum entropy (ME) principle, where acoustic and linguistic features are optimally merged in a unified framework. The correlations between acoustic and linguistic features are explored and properly represented in the integrated models. Owing to the flexibility of the ME model, we can further combine other high-level linguistic features. In the experiments, we apply the proposed methods to broadcast news transcription using the MATBN database. We obtain significant improvement compared to a conventional speech recognition system using individual maximum likelihood training.
AB - Traditionally, a speech recognition system is built on the assumption that acoustic and linguistic information sources are independent. Parameters of the hidden Markov model and the n-gram model are estimated individually and then plugged into a maximum a posteriori classification rule. However, acoustic and linguistic features are inherently correlated, so modeling performance is limited accordingly. This study aims to relax the independence assumption and achieve sophisticated acoustic and linguistic modeling for speech recognition. We propose an integrated approach based on the maximum entropy (ME) principle, where acoustic and linguistic features are optimally merged in a unified framework. The correlations between acoustic and linguistic features are explored and properly represented in the integrated models. Owing to the flexibility of the ME model, we can further combine other high-level linguistic features. In the experiments, we apply the proposed methods to broadcast news transcription using the MATBN database. We obtain significant improvement compared to a conventional speech recognition system using individual maximum likelihood training.
UR - http://www.scopus.com/inward/record.url?scp=33947694338&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:33947694338
SN - 142440469X
SN - 9781424404698
T3 - ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
SP - I1061-I1064
BT - 2006 IEEE International Conference on Acoustics, Speech, and Signal Processing - Proceedings
T2 - 2006 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2006
Y2 - 14 May 2006 through 19 May 2006
ER -