運用 Python 結合語音辨識及合成技術於自動化音文同步之實作

Translated title of the contribution: A Python implementation of automatic speech-text synchronization using speech recognition and text-to-speech technology

Chun Han Lai, Chao Kai Chang, Renyuan Lyu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

In this study, we present a method for creating audiobooks with synchronized speech and text, using speech recognition and cloud text-to-speech technology. With this method, users can turn arbitrary articles of their own into learning materials for the shadowing technique. The materials are audiobooks whose speech and text are synchronized at the word level. They are built from timed-text files, which are produced from the user's articles and the corresponding speech files. Our speech-text synchronization tool, CGUAlign, makes these timed-text files easy to create. CGUAlign uses Python to wrap HTK (Hidden Markov Model Toolkit), a well-known speech recognition toolkit. Given only a text file and the corresponding speech file, obtained from a cloud text-to-speech service, CGUAlign creates the timed-text file that synchronizes the speech with the text. We also build a simple website in JavaScript that uses the timed-text files for CALL (Computer-Assisted Language Learning). On the website, users can browse the synchronized audiobooks and practice shadowing with ease. The website also provides a dictionary function in support of CALL.
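As a rough illustration of the last step of such a pipeline, the sketch below converts word-level alignment labels into a timed-text file. It assumes the aligner emits HTK-style label lines (`start end label`, with times in units of 100 ns) and that the timed-text target is WebVTT; the abstract does not state which timed-text format CGUAlign actually writes, and the function names `parse_htk_labels`, `format_timestamp`, and `to_webvtt` are hypothetical, not part of CGUAlign.

```python
# Minimal sketch, not CGUAlign's code: HTK-style word labels -> WebVTT cues.
# Assumption: each input line is "start end label", times in 100 ns units.

HTK_UNITS_PER_SECOND = 10_000_000  # 1 s = 10,000,000 units of 100 ns
SILENCE_LABELS = {"sil", "sp"}     # assumed names of the silence models


def parse_htk_labels(text):
    """Return (start_sec, end_sec, word) tuples, skipping silence labels."""
    words = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # ignore blank or malformed lines
        start, end, label = int(parts[0]), int(parts[1]), parts[2]
        if label in SILENCE_LABELS:
            continue
        words.append((start / HTK_UNITS_PER_SECOND,
                      end / HTK_UNITS_PER_SECOND,
                      label))
    return words


def format_timestamp(seconds):
    """Format seconds as a WebVTT timestamp, hh:mm:ss.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_webvtt(words):
    """Emit one WebVTT cue per aligned word."""
    lines = ["WEBVTT", ""]
    for start, end, word in words:
        lines.append(f"{format_timestamp(start)} --> {format_timestamp(end)}")
        lines.append(word)
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "0 3500000 sil\n3500000 7200000 HELLO\n7200000 12500000 WORLD\n"
    print(to_webvtt(parse_htk_labels(sample)))
```

One cue per word keeps the file directly usable for word-level highlighting in a browser player, at the cost of a larger file than sentence-level cues.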

Original language: Chinese (Traditional)
Title of host publication: Proceedings of the 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015
Editors: Sin-Horng Chen, Hsin-Min Wang, Jen-Tzung Chien, Hung-Yu Kao, Wen-Whei Chang, Yih-Ru Wang, Shih-Hung Wu
Publisher: The Association for Computational Linguistics and Chinese Language Processing (ACLCLP)
Pages: 289-305
Number of pages: 17
ISBN (Electronic): 9789573079286
State: Published - 1 Oct 2015
Event: 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015 - Hsinchu, Taiwan
Duration: 1 Oct 2015 → 2 Oct 2015

Publication series

Name: Proceedings of the 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015

Conference

Conference: 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015
Country/Territory: Taiwan
City: Hsinchu
Period: 1 Oct 2015 → 2 Oct 2015

Bibliographical note

Publisher Copyright:
© Proceedings of the 27th Conference on Computational Linguistics and Speech Processing, ROCLING 2015.
