Abstract
This paper describes a large-vocabulary Taiwanese (Min-nan) speech recognition system. Because Taiwanese exhibits severe multiple-pronunciation phenomena, caused in part by tone sandhi, the system uses a statistical pronunciation modeling technique based on tonal features. The system is speaker independent. It was trained on a bilingual Mandarin/Taiwanese speech corpus to compensate for the lack of a purely Taiwanese speech corpus. The search network is built from Chinese-character nodes, so the recognizer outputs a Chinese character string directly. Experiments show that the proposed approaches reduce the character error rate significantly, from 21.50% to 11.97%.
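
As a reading aid only, the two error rates reported above can be restated as absolute and relative reductions. The symbols $\mathrm{CER}_{\text{base}}$ and $\mathrm{CER}_{\text{prop}}$ are labels introduced here for the 21.50% and 11.97% figures; they are not notation from the paper, and the derived values are simple arithmetic on those two numbers.

$$
\begin{aligned}
\Delta_{\text{abs}} &= \mathrm{CER}_{\text{base}} - \mathrm{CER}_{\text{prop}} = 21.50\% - 11.97\% = 9.53 \text{ percentage points},\\
\Delta_{\text{rel}} &= \frac{\mathrm{CER}_{\text{base}} - \mathrm{CER}_{\text{prop}}}{\mathrm{CER}_{\text{base}}} = \frac{9.53}{21.50} \approx 44.3\%.
\end{aligned}
$$
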
Original language | English |
---|---|
Pages | 1861-1864 |
Number of pages | 4 |
State | Published - 2003 |
Event | 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - Geneva, Switzerland; Duration: 1 Sep 2003 → 4 Sep 2003 |
Conference
Conference | 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 |
---|---|
Country/Territory | Switzerland |
City | Geneva |
Period | 1 Sep 2003 → 4 Sep 2003 |