Incorporating Linguistic Knowledge in Statistical Machine Translation: Translating Prepositions

Reshef Shilon, Hanna Fadida, Shuly Wintner

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Prepositions are hard to translate because their meaning is often vague and the choice of the correct preposition is often arbitrary. At the same time, making the correct choice is often critical to the coherence of the output text. In the context of statistical machine translation, this difficulty is compounded by the potentially long distance between the preposition and the head it modifies, which the local nature of standard language models cannot capture. In this work we use monolingual language resources to determine the set of prepositions that are most likely to occur with each verb. We use this information in a transfer-based Arabic-to-Hebrew statistical machine translation system. We show that incorporating linguistic knowledge on the distribution of prepositions significantly improves the translation quality.
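
As a rough illustration of determining which prepositions are most likely to occur with each verb, the following minimal sketch counts verb-preposition co-occurrences and ranks them by relative frequency. It is not the authors' implementation: the function name `likely_prepositions`, the `top_k` parameter, and the toy English pairs are all assumptions made for this example, and the pairs are assumed to have been extracted already from a tagged or parsed monolingual corpus.

```python
from collections import Counter, defaultdict


def likely_prepositions(pairs, top_k=2):
    """Map each verb to its top_k most frequent prepositions.

    `pairs` is an iterable of (verb, preposition) observations.
    Each preposition is returned with its relative frequency for that verb.
    """
    counts = defaultdict(Counter)
    for verb, prep in pairs:
        counts[verb][prep] += 1

    result = {}
    for verb, prep_counts in counts.items():
        total = sum(prep_counts.values())
        result[verb] = [
            (prep, count / total)
            for prep, count in prep_counts.most_common(top_k)
        ]
    return result


# Toy, made-up observations (hypothetical data, for illustration only).
pairs = [
    ("rely", "on"), ("rely", "on"), ("rely", "upon"),
    ("think", "about"), ("think", "about"), ("think", "about"),
    ("think", "of"),
]
table = likely_prepositions(pairs)
```

A table of this kind could, in principle, be consulted by a translation system to prefer target-side prepositions that are frequent with the translated verb; how the paper actually integrates the information is described in the full text, not here.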
Original language: English
Title of host publication: Proceedings of the Workshop on Innovative Hybrid Approaches to the Processing of Textual Data
Place of Publication: Avignon, France
Publisher: Association for Computational Linguistics
Pages: 106-114
Number of pages: 9
State: Published - 1 Apr 2012
