Deep Domain Adaptation for Low-Resource Cross-Lingual Text Classification Tasks

Guan Yuan Chen*, Von Wun Soo

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

1 Scopus citation

Abstract

Recently, data-driven machine learning approaches have succeeded on many text classification tasks in resource-abundant languages. However, many languages still lack sufficient labeled data for the same tasks. For these low-resource languages, high-quality parallel corpora may be costly to obtain, and automated machine translation may be unreliable or unavailable. In this work, we propose an effective transfer learning method for scenarios where large-scale cross-lingual data is not available. It combines two transfer learning schemes, parameter sharing (parameter-based) and domain adaptation (feature-based), which are jointly trained on high-resource and low-resource languages together. We conducted cross-lingual transfer learning experiments on text classification of sentiment, subjectivity, and question types, from English to Chinese and from English to Vietnamese. The experiments show that the proposed approach significantly outperformed state-of-the-art models trained only on monolingual data on the corresponding benchmarks.

Original language: English
Title of host publication: Computational Linguistics - 16th International Conference of the Pacific Association for Computational Linguistics, PACLING 2019, Revised Selected Papers
Editors: Le-Minh Nguyen, Satoshi Tojo, Xuan-Hieu Phan, Kôiti Hasida
Publisher: Springer
Pages: 155-168
Number of pages: 14
ISBN (Print): 9789811561672
DOIs
State: Published - 2020
Externally published: Yes
Event: 16th International Conference of the Pacific Association for Computational Linguistics, PACLING 2019 - Hanoi, Viet Nam
Duration: 11 Oct 2019 to 13 Oct 2019

Publication series

Name: Communications in Computer and Information Science
Volume: 1215 CCIS
ISSN (Print): 1865-0929
ISSN (Electronic): 1865-0937

Conference

Conference: 16th International Conference of the Pacific Association for Computational Linguistics, PACLING 2019
Country/Territory: Viet Nam
City: Hanoi
Period: 11/10/19 to 13/10/19

Bibliographical note

Publisher Copyright:
© 2020, Springer Nature Singapore Pte Ltd.

Keywords

  • Cross-Lingual
  • Deep domain adaptation
  • Transfer learning
