Project Details
Abstract
With the support of the National Science Council and Chang Gung University, the principal
investigator (PI) of this project has led a team of graduate students from Chang
Gung University and National Tsing Hua University in building a
Taiwanese/Mandarin bilingual speech research environment. It includes a large-scale
Taiwanese speech corpus, a bilingual (Mandarin/Taiwanese) lexicon, a bilingual speech
recognition engine, a Taiwanese Text-to-Speech (TTS) system, and additional
research results in speaker recognition and music signal recognition. Building on this groundwork,
the PI proposes this 3-year project on language boundary detection,
language identification, and speech recognition for code-switching speech, covering
Taiwanese-Mandarin and Mandarin-English mixed-language speech. Because
Taiwan is a multilingual society in which most people use Mandarin-Taiwanese,
Mandarin-Hakka, or Mandarin-English code-switching speech in everyday
conversation, we plan to collect speech and text corpora of code-switching speech. To
make the system more complete and useful, speech output technology will be considered
alongside speech input technology.
In this project, we will emphasize the following aspects:
1. Collect a spontaneous English-Mandarin and Taiwanese-Mandarin
code-switching speech corpus from meetings, drama, and published speech.
2. Collect a spontaneous English-Mandarin code-switching text corpus from the internet.
Develop and collect a multilingual speech corpus.
3. Collect a telephone and microphone-based English-Mandarin and
Taiwanese-Mandarin code-switching speech corpus and prepare it for release to the
speech recognition community.
4. Develop a technique for language boundary detection in code-switching speech.
5. Implement a language identification model for code-switching speech.
6. Implement the core speech recognition technique for code-switching speech.
7. Build a pronunciation model to improve recognition accuracy when a user utters
code-switching speech.
8. Integrate the three techniques of language boundary detection, language
identification, and speech recognition.
We will follow these steps to achieve the above objectives over 3 years:
1. Collect spontaneous code-switching speech and text corpora (1st year).
2. Develop the technique of language boundary detection (1st year).
3. Collect telephone and microphone-based code-switching speech (2nd year).
4. Validate the code-switching speech corpus (2nd year).
5. Improve the language identifier (2nd year).
6. Construct the code-switching acoustic model and speech recognizer (3rd year).
7. Build the model of pronunciation variations (3rd year).
8. Integrate the three techniques of language boundary detection, language identification,
and speech recognition (3rd year).
Project IDs
Project ID:PB9706-0886
External Project ID:NSC96-2221-E182-032-MY3
Status | Finished |
---|---|
Effective start/end date | 01/08/08 → 31/07/09 |