Language Detection, Identification and Speech Recognition on Code-Switching (Mixed-Language) Speech

Project: National Science and Technology Council Academic Grants

Project Details

Abstract

With support from the National Science Council and Chang Gung University, the principal investigator (PI) of this project has led a team of graduate students from Chang Gung University and National Tsing Hua University in building a well-established Taiwanese/Mandarin bilingual speech research environment. It includes a large-scale Taiwanese speech corpus, a bilingual (Mandarin/Taiwanese) lexicon, a bilingual speech recognition engine, and a Taiwanese text-to-speech (TTS) system, along with additional results in speaker recognition and music signal recognition. Building on this groundwork, the PI proposes this three-year project on language boundary detection, language identification, and speech recognition for code-switching speech, including Taiwanese-Mandarin and Mandarin-English mixed-language speech. Taiwan is a multilingual society in which most people code-switch between Mandarin and Taiwanese, Mandarin and Hakka, or Mandarin and English in everyday conversation, so we plan to collect speech and text corpora of code-switching speech. To make the system more complete and useful, speech output technology will be considered alongside speech input technology.

The project will emphasize the following aspects:

1. Collect spontaneous English-Mandarin and Taiwanese-Mandarin code-switching speech corpora from meetings, dramas, and published speech.
2. Collect a spontaneous English-Mandarin code-switching text corpus from the internet. Develop and collect a multilingual speech corpus.
3. Collect telephone and microphone-based English-Mandarin and Taiwanese-Mandarin code-switching speech corpora and prepare them for release to the speech recognition community.
4. Develop techniques for language boundary detection on code-switching speech.
5. Implement a language identification model for code-switching speech.
6. Implement the core speech recognition technique for code-switching speech.
7. Build a pronunciation model to improve recognition accuracy when a user utters code-switching speech.
8. Integrate the three techniques of language boundary detection, language identification, and speech recognition.

We will pursue these objectives over three years in the following steps:

1. Collect spontaneous code-switching speech and text corpora. (1st year)
2. Develop the language boundary detection technique. (1st year)
3. Collect telephone and microphone-based code-switching speech. (2nd year)
4. Validate the code-switching speech corpus. (2nd year)
5. Improve the language identifier. (2nd year)
6. Construct the code-switching acoustic model and speech recognizer. (3rd year)
7. Build the model of pronunciation variations. (3rd year)
8. Integrate the three techniques of language boundary detection, language identification, and speech recognition. (3rd year)

Project IDs

Project ID: PB9706-0886
External Project ID: NSC96-2221-E182-032-MY3
Status: Finished
Effective start/end date: 01/08/08 to 31/07/09
