TY - JOUR
T1 - Cross-lingual audio-to-text alignment for multimedia content management
AU - Lyu, Dau Cheng
AU - Lyu, Ren Yuan
AU - Chiang, Yuang Chin
AU - Hsu, Chun Nan
PY - 2008/6
Y1 - 2008/6
N2 - This paper addresses a content management problem in situations where we have a collection of spoken documents in audio stream format in one language and a collection of related text documents in another. In our case, we have a huge digital archive of audio broadcast news in Taiwanese, but its transcriptions are unavailable. Meanwhile, we have a collection of related text-based news stories, but they are written in Chinese characters. Due to the lack of a standard written form for Taiwanese, manual transcription of spoken documents is prohibitively expensive, and automatic transcription by speech recognition is infeasible because of its poor performance for Taiwanese spontaneous speech. We present an approximate solution by aligning Taiwanese spoken documents with related text documents in Mandarin. The idea is to take advantage of the abundance of Mandarin text documents available in our application to compensate for the limitations of speech recognition systems. Experimental results show that even though our speech recognizer for spontaneous Taiwanese performs poorly, our approach still achieves a high (82.5%) alignment accuracy, sufficient for content management.
AB - This paper addresses a content management problem in situations where we have a collection of spoken documents in audio stream format in one language and a collection of related text documents in another. In our case, we have a huge digital archive of audio broadcast news in Taiwanese, but its transcriptions are unavailable. Meanwhile, we have a collection of related text-based news stories, but they are written in Chinese characters. Due to the lack of a standard written form for Taiwanese, manual transcription of spoken documents is prohibitively expensive, and automatic transcription by speech recognition is infeasible because of its poor performance for Taiwanese spontaneous speech. We present an approximate solution by aligning Taiwanese spoken documents with related text documents in Mandarin. The idea is to take advantage of the abundance of Mandarin text documents available in our application to compensate for the limitations of speech recognition systems. Experimental results show that even though our speech recognizer for spontaneous Taiwanese performs poorly, our approach still achieves a high (82.5%) alignment accuracy, sufficient for content management.
KW - Audio document retrieval
KW - Cross-language information retrieval
KW - Parallel document alignment
KW - Speech recognition
UR - http://www.scopus.com/inward/record.url?scp=44849128022&partnerID=8YFLogxK
U2 - 10.1016/j.dss.2007.07.003
DO - 10.1016/j.dss.2007.07.003
M3 - Article
AN - SCOPUS:44849128022
SN - 0167-9236
VL - 45
SP - 554
EP - 566
JO - Decision Support Systems
JF - Decision Support Systems
IS - 3
ER -