Phonetic transcription using speech recognition technique considering variations in pronunciation

Min Siong Liang, Ren Yuan Lyu, Yuang Chin Chiang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

5 Scopus citations

Abstract

We propose a new approach to phonetic transcription of speech and text that combines automatic speech recognition (ASR) with grapheme-to-phoneme (G2P) techniques. The text is paired with the corresponding human speech utterance, and ASR is run over a sausage search net constructed from the multiple candidate pronunciations of the text, which reduces the effort required for phonetic transcription. Using a multiple-pronunciation lexicon, a transcription error rate of 12.74% was achieved. Adapting the pronunciation lexicon with pronunciation variation (PV) rules gave a further improvement, with an error rate reduction of 17.11%.
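As a rough illustration of the idea in the abstract, the sketch below builds a sausage net (one slot per word, each slot holding that word's alternative pronunciations), expands it with pronunciation variation rules, and picks one pronunciation per slot. It is not the authors' implementation: the lexicon entries, the rule table, the function names, and the `score` callback (a stand-in for the acoustic score an ASR decoder would supply) are all hypothetical, and choosing each slot independently is a simplification of a real decoder's search.

```python
# Hypothetical multiple-pronunciation lexicon: word -> candidate pronunciations.
# The tokens are placeholders, not real phonetic data.
LEXICON = {
    "A": ["a1", "a2"],
    "B": ["b1"],
    "C": ["c1", "c2", "c3"],
}


def build_sausage(words, lexicon):
    """Return one slot per word; each slot lists that word's pronunciations."""
    return [list(lexicon[w]) for w in words]


def apply_pv_rules(sausage, rules):
    """Add variants from pronunciation-variation rules (canonical -> variants)."""
    expanded = []
    for slot in sausage:
        new_slot = list(slot)
        for pron in slot:
            for variant in rules.get(pron, []):
                if variant not in new_slot:
                    new_slot.append(variant)
        expanded.append(new_slot)
    return expanded


def best_path(sausage, score):
    """Pick the highest-scoring pronunciation in each slot.

    `score(slot_index, pronunciation)` is an assumed stand-in for the
    acoustic likelihood that a speech recognizer would compute.
    """
    return [max(slot, key=lambda p: score(i, p)) for i, slot in enumerate(sausage)]


if __name__ == "__main__":
    net = build_sausage(["A", "B"], LEXICON)
    net = apply_pv_rules(net, {"b1": ["b1x"]})  # hypothetical PV rule
    preferred = {"a2", "b1x"}                   # pretend the audio favors these
    print(best_path(net, lambda i, p: 1.0 if p in preferred else 0.0))
```

In this toy setup the transcriber's job shrinks to checking the selected path, which is the kind of effort reduction the abstract describes.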

Original language: English
Title of host publication: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07
Pages: IV109-IV112
State: Published - 2007
Event: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07 - Honolulu, HI, United States
Duration: 15 Apr 2007 to 20 Apr 2007

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
Volume: 4
ISSN (Print): 1520-6149

Conference

Conference: 2007 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '07
Country/Territory: United States
City: Honolulu, HI
Period: 15/04/07 to 20/04/07

Keywords

  • Automatic phonetic transcription
  • Chinese
  • Dialect
  • Pronunciation variation
  • Taiwanese
