Automatic detection of translation direction

Ilia Sominsky, Shuly Wintner

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Parallel corpora are crucial resources for NLP applications, most notably for machine translation. The direction of the (human) translation of parallel corpora has been shown to have significant implications for the quality of statistical machine translation systems that are trained with such corpora. We describe a method for determining the direction of the (manual) translation of parallel corpora at the sentence-pair level. Using several linguistically-motivated features, coupled with a neural network model, we obtain high accuracy on several language pairs. Furthermore, we demonstrate that the accuracy is correlated with the (typological) distance between the two languages.

Original language: English
Title of host publication: International Conference on Recent Advances in Natural Language Processing in a Deep Learning World, RANLP 2019 - Proceedings
Editors: Galia Angelova, Ruslan Mitkov, Ivelina Nikolova, Irina Temnikova
Publisher: Incoma Ltd
Pages: 1131-1140
Number of pages: 10
ISBN (Electronic): 9789544520557
State: Published - 2019
Event: 12th International Conference on Recent Advances in Natural Language Processing, RANLP 2019 - Varna, Bulgaria
Duration: 2 Sep 2019 → 4 Sep 2019

Publication series

Name: International Conference Recent Advances in Natural Language Processing, RANLP
Volume: 2019-September
ISSN (Print): 1313-8502

Conference

Conference: 12th International Conference on Recent Advances in Natural Language Processing, RANLP 2019
Country/Territory: Bulgaria
City: Varna
Period: 2/09/19 → 4/09/19

Bibliographical note

Funding Information:
This research was supported by Grant No. 2017699 from the United States-Israel Binational Science Foundation (BSF), by Grant No. 1813153 from the United States National Science Foundation (NSF), and by grant No. LU 856/13-1 from the Deutsche Forschungsgemeinschaft. We are grateful to Noam Ordan and Ella Rabinovich for helpful discussions and creative ideas. We thank the anonymous RANLP reviewers for their constructive comments. All remaining errors and misconceptions are, of course, our own.

Publisher Copyright:
© 2019 Association for Computational Linguistics (ACL). All rights reserved.

ASJC Scopus subject areas

  • Software
  • Computer Science Applications
  • Artificial Intelligence
  • Electrical and Electronic Engineering
