Function words in statistical machine-translated Chinese and original Chinese: A study into the translationese of machine translation systems

Chen Li Kuo*

*Corresponding author for this work

Research output: Contribution to journal › Journal Article › peer-review

7 Scopus citations

Abstract

Statistical approaches have become the mainstream in machine translation (MT), for their potential in producing less rigid and more natural translations than rule-based approaches. However, on closer examination, the uses of function words between statistical machine-translated Chinese and the original Chinese are different, and such differences may be associated with translationese as discussed in translation studies. This article examines the distribution of Chinese function words in a comparable corpus consisting of MTs and the original Chinese texts extracted from Wikipedia. An attribute selection technique is used to investigate which types of function words are significant in discriminating between statistical machine-translated Chinese and the original texts. The results show that statistical MT overuses the most frequent function words, even when alternatives exist. To improve the quality of the end product, developers of MT should pay close attention to modelling Chinese conjunctions and adverbial function words. The results also suggest that machine-translated Chinese shares some characteristics with human-translated texts, including normalization and being influenced by the source language; however, machine-translated texts do not exhibit other characteristics of translationese such as explicitation.
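The abstract does not name the attribute selection technique that was used. As a minimal sketch of the general idea, assuming information gain as the ranking criterion and assuming the texts are already word-segmented, the snippet below scores each function word by how well its presence separates machine-translated documents from original ones. The word list, the toy documents, and the function names are all hypothetical illustrations, not data or code from the study.

```python
import math
from collections import Counter


def entropy(counts):
    """Shannon entropy (in bits) of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def information_gain(docs, labels, word):
    """Information gain of the binary feature 'word occurs in the document'."""
    base = entropy(list(Counter(labels).values()))
    # Class counts for documents that contain the word and those that do not.
    split = {True: Counter(), False: Counter()}
    for tokens, label in zip(docs, labels):
        split[word in tokens][label] += 1
    n = len(docs)
    remainder = sum(
        (sum(c.values()) / n) * entropy(list(c.values()))
        for c in split.values()
    )
    return base - remainder


def rank_function_words(docs, labels, function_words):
    """Return (word, score) pairs sorted from most to least discriminating."""
    scores = {w: information_gain(docs, labels, w) for w in function_words}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy, invented data: each document is a list of pre-segmented tokens.
docs = [
    ["我", "的", "书", "和", "笔"],      # labelled as machine-translated
    ["他", "的", "车", "和", "房子"],    # labelled as machine-translated
    ["然而", "他", "没", "来"],          # labelled as original
    ["我", "的", "书", "而且", "好"],    # labelled as original
]
labels = ["mt", "mt", "orig", "orig"]
function_words = ["的", "和", "然而", "而且"]  # hypothetical list

ranked = rank_function_words(docs, labels, function_words)
for word, score in ranked:
    print(f"{word}\t{score:.4f}")
```

In this toy setup the conjunction 和 separates the two classes perfectly and so ranks first. A real analysis would use frequency-based features over a large comparable corpus rather than simple presence in four invented documents.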

Original language: English
Pages (from-to): 752-771
Number of pages: 20
Journal: Digital Scholarship in the Humanities
Volume: 34
Issue number: 4
State: Published - 01 12 2019

Bibliographical note

Publisher Copyright:
© 2018 The Author(s). Published by Oxford University Press on behalf of EADH. All rights reserved. For permissions, please email: [email protected].
