Much research in translation studies indicates that translated texts are ontologically different from original non-translated ones. Translated texts, in any language, can be considered a dialect of that language, known as 'translationese'. Several characteristics of translationese have been proposed as universal in a series of hypotheses. In this work, we test these hypotheses using a computational methodology that is based on supervised machine learning. We define several classifiers that implement various linguistically informed features, and assess the degree to which different sets of features can distinguish between translated and original texts. We demonstrate that some feature sets are indeed good indicators of translationese, thereby corroborating some hypotheses, whereas others perform much worse (sometimes at chance level), indicating that some 'universal' assumptions have to be reconsidered.
In memoriam: Miriam Shlesinger, 1947-2012.
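As a rough illustration of the methodology outlined in the abstract, the sketch below scores one feature set at a time by how well it separates translated from original texts. Everything in it is an assumption made for illustration: the two features (type-token ratio and mean sentence length), the nearest-centroid classifier, the leave-one-out evaluation, and all function names are hypothetical stand-ins, not the features, classifiers, or evaluation protocol used in the article.

```python
import re
from statistics import mean


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def type_token_ratio(text):
    """Illustrative feature: distinct tokens divided by total tokens."""
    tokens = tokenize(text)
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def mean_sentence_length(text):
    """Illustrative feature: average number of tokens per sentence."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return mean(len(tokenize(s)) for s in sentences) if sentences else 0.0


# Hypothetical feature sets; each one is evaluated on its own.
FEATURE_SETS = {
    "lexical_variety": [type_token_ratio],
    "sentence_length": [mean_sentence_length],
}


def vectorize(text, features):
    """Turn a text into a feature vector using the given feature functions."""
    return [feature(text) for feature in features]


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    return [mean(column) for column in zip(*vectors)]


def predict(vector, centroids):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(vector, centroids[label])),
    )


def leave_one_out_accuracy(texts, labels, features):
    """Nearest-centroid accuracy under leave-one-out cross-validation.

    Assumes every label occurs at least twice, so each held-out text
    still leaves a training example of its own class.
    """
    vectors = [vectorize(text, features) for text in texts]
    correct = 0
    for i, held_out in enumerate(vectors):
        train = [(v, y) for j, (v, y) in enumerate(zip(vectors, labels)) if j != i]
        centroids = {
            label: centroid([v for v, y in train if y == label])
            for label in {y for _, y in train}
        }
        correct += predict(held_out, centroids) == labels[i]
    return correct / len(vectors)
```

Given a labelled corpus, calling `leave_one_out_accuracy(texts, labels, FEATURE_SETS[name])` for each `name` gives one accuracy per feature set. With two balanced classes, an accuracy near 0.5 would correspond to the chance-level performance the abstract mentions, while a clearly higher value would suggest that the feature set carries a signal of translationese.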
Bibliographical note
Publisher Copyright:
© The Author 2013. Published by Oxford University Press on behalf of EADH.
ASJC Scopus subject areas
- Information Systems
- Language and Linguistics
- Linguistics and Language
- Computer Science Applications