Language identification on code-switching utterances using multiple cues

Dau Cheng Lyu*, Ren Yuan Lyu

*Corresponding author for this work

Research output: Contribution to journal › Conference article › peer-review

47 Scopus citations

Abstract

Code-switching speech is speech in which a single utterance contains two or more languages. The switch usually occurs at the clause or word level. In this paper, we propose a two-stage framework, consisting of a language identifier followed by a speech recognizer, and evaluate it on Mandarin-Taiwanese code-switching utterances. The language identifier uses multiple cues, including acoustic, prosodic, and phonetic features. To integrate these cues and distinguish one language from the other, we use a maximum a posteriori decision rule that combines an acoustic model, a duration model, and a language model. In the experiments, the framework achieves error rate reductions of 34.5% for language identification (LID) and 17.7% for speech recognition (ASR) compared with a one-stage LVCSR-based system.
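The abstract does not give the exact form of the decision rule, so the following is only a minimal sketch of how a maximum a posteriori choice over several cue scores could look. It assumes each cue (acoustic, duration, language model) yields a log score per candidate language and that the scores are combined as a weighted sum with a log prior. The function name `map_language`, the weights, the priors, and all numeric scores are hypothetical and are not taken from the paper.

```python
import math


def map_language(scores, priors, weights=(1.0, 1.0, 1.0)):
    """Return the language with the highest combined log-posterior score.

    scores:  {language: (acoustic_loglik, duration_loglik, lm_logprob)}
    priors:  {language: prior probability}
    weights: hypothetical per-cue weights (an assumption, not from the paper)
    """
    best_lang, best_score = None, -math.inf
    for lang, cues in scores.items():
        # log P(lang) + weighted sum of the per-cue log scores
        total = math.log(priors[lang]) + sum(w * s for w, s in zip(weights, cues))
        if total > best_score:
            best_lang, best_score = lang, total
    return best_lang


# Made-up log scores for one speech segment (illustration only).
segment_scores = {
    "mandarin": (-120.0, -8.0, -15.0),
    "taiwanese": (-118.0, -11.0, -14.5),
}
priors = {"mandarin": 0.5, "taiwanese": 0.5}

decision = map_language(segment_scores, priors)
print(decision)
```

In this toy segment the acoustic and language-model scores slightly favor Taiwanese, but the duration score tips the combined decision toward Mandarin, which illustrates why integrating several cues can change the outcome compared with relying on any single one.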

Original language: English
Pages (from-to): 711-714
Number of pages: 4
Journal: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH
State: Published - 2008
Event: INTERSPEECH 2008 - 9th Annual Conference of the International Speech Communication Association - Brisbane, QLD, Australia
Duration: 22 Sep 2008 to 26 Sep 2008

Keywords

  • Code-switching speech
  • Language identification
  • Linguistic cues
  • Speech recognition
