Language models for machine translation: Original vs. translated texts

Gennadi Lembersky, Noam Ordan, Shuly Wintner

Research output: Contribution to journal › Article › peer-review

Abstract

We investigate the differences between language models compiled from original target-language texts and those compiled from texts manually translated to the target language. Corroborating established observations of Translation Studies, we demonstrate that the latter are significantly better predictors of translated sentences than the former, and hence fit the reference set better. Furthermore, translated texts yield better language models for statistical machine translation than original texts.

Original language: English
Pages (from-to): 799-825
Number of pages: 27
Journal: Computational Linguistics
Volume: 38
Issue number: 4
State: Published - Dec 2012

ASJC Scopus subject areas

  • Language and Linguistics
  • Linguistics and Language
  • Computer Science Applications
  • Artificial Intelligence
