Abstract
State-of-the-art machine translation (MT) systems are typically trained to generate "standard" target language; however, many languages have multiple varieties (regional varieties, dialects, sociolects, non-native varieties) that are different from the standard language. Such varieties are often low-resource, and hence do not benefit from contemporary NLP solutions, MT included. We propose a general framework to rapidly adapt MT systems to generate language varieties that are close to, but different from, the standard target language, using no parallel (source-variety) data. This also includes adaptation of MT systems to low-resource typologically related target languages. We experiment with adapting an English-Russian MT system to generate Ukrainian and Belarusian, an English-Norwegian Bokmål system to generate Nynorsk, and an English-Arabic system to generate four Arabic dialects, obtaining significant improvements over competitive baselines.
| Original language | English |
|---|---|
| Title of host publication | ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 110-121 |
| Number of pages | 12 |
| ISBN (Electronic) | 9781954085534 |
| State | Published - 2021 |
| Event | Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021 - Virtual, Online. Duration: 1 Aug 2021 → 6 Aug 2021 |
Publication series
| Name | ACL-IJCNLP 2021 - 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, Proceedings of the Conference |
|---|---|
| Volume | 2 |
Conference
| Conference | Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, ACL-IJCNLP 2021 |
|---|---|
| City | Virtual, Online |
| Period | 1/08/21 → 6/08/21 |
Bibliographical note
Publisher Copyright: © 2021 Association for Computational Linguistics.
ASJC Scopus subject areas
- Software
- Computational Theory and Mathematics
- Linguistics and Language
- Language and Linguistics