Abstract
Recently, data-driven machine learning approaches have succeeded on many text classification tasks in resource-abundant languages. However, many languages still lack sufficient labeled data for the same tasks. For these low-resource languages, a high-quality parallel corpus may be costly to obtain, and automated machine translation may be unreliable or unavailable. In this work, we propose an effective transfer learning method for scenarios where large-scale cross-lingual data is not available. It combines two transfer learning schemes, parameter sharing (parameter-based) and domain adaptation (feature-based), which are jointly trained on high-resource and low-resource languages together. We conducted cross-lingual transfer learning experiments on text classification of sentiment, subjectivity, and question types, from English to Chinese and from English to Vietnamese. The experiments show that the proposed approach significantly outperformed state-of-the-art models trained only on monolingual data on the corresponding benchmarks.
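The combination described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the architecture or the adaptation loss, so the sketch assumes a single shared tanh layer (parameter sharing), a squared distance between mean source and target features as the feature-based adaptation term, and inputs that are already fixed-size vectors. All names (`encode`, `joint_loss`, `lam`) are hypothetical.

```python
import math

def encode(shared, x):
    """Shared encoder: one tanh layer whose weights serve BOTH languages (assumed)."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in shared]

def classify(head, feats):
    """Linear classification head producing one logit per class."""
    return [sum(w * f for w, f in zip(row, feats)) for row in head]

def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example (numerically stabilised)."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def joint_loss(shared, head, src_batch, tgt_batch, lam=0.1):
    """Task loss over both languages plus a feature-alignment penalty.

    The penalty (squared distance between mean features) is an assumed
    stand-in for the domain adaptation term, not the paper's loss.
    """
    src_feats = [encode(shared, x) for x, _ in src_batch]
    tgt_feats = [encode(shared, x) for x, _ in tgt_batch]
    labelled = list(zip(src_feats, (y for _, y in src_batch)))
    labelled += list(zip(tgt_feats, (y for _, y in tgt_batch)))
    task = sum(cross_entropy(classify(head, f), y) for f, y in labelled) / len(labelled)
    gap = sum((a - b) ** 2
              for a, b in zip(mean_vector(src_feats), mean_vector(tgt_feats)))
    return task + lam * gap

# Toy data: many labelled high-resource examples, few low-resource ones.
shared = [[0.5, -0.2], [0.1, 0.3]]
head = [[1.0, 0.0], [0.0, 1.0]]
english = [([1.0, 0.0], 0), ([0.0, 1.0], 1)]
chinese = [([0.9, 0.1], 0)]
print(joint_loss(shared, head, english, chinese))
```

Because both batches pass through the same `shared` weights, gradients from either language would update the same encoder, while `lam` controls how strongly the two feature distributions are pulled together.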
Original language | English |
---|---|
Title of host publication | Computational Linguistics - 16th International Conference of the Pacific Association for Computational Linguistics, PACLING 2019, Revised Selected Papers |
Editors | Le-Minh Nguyen, Satoshi Tojo, Xuan-Hieu Phan, Kôiti Hasida |
Publisher | Springer |
Pages | 155-168 |
Number of pages | 14 |
ISBN (Print) | 9789811561672 |
State | Published - 2020 |
Externally published | Yes |
Event | 16th International Conference of the Pacific Association for Computational Linguistics, PACLING 2019 - Hanoi, Viet Nam. Duration: 11 Oct 2019 → 13 Oct 2019 |
Publication series
Name | Communications in Computer and Information Science |
---|---|
Volume | 1215 CCIS |
ISSN (Print) | 1865-0929 |
ISSN (Electronic) | 1865-0937 |
Conference
Conference | 16th International Conference of the Pacific Association for Computational Linguistics, PACLING 2019 |
---|---|
Country/Territory | Viet Nam |
City | Hanoi |
Period | 11/10/19 → 13/10/19 |
Bibliographical note
Publisher Copyright: © 2020, Springer Nature Singapore Pte Ltd.
Keywords
- Cross-Lingual
- Deep domain adaptation
- Transfer learning