In this paper, we propose using automatic text classification methods to analyse variation in English-German translations from both a quantitative and a qualitative perspective. The experiments are carried out in two steps: we train classifiers to 1) discriminate between different genres (fiction, political essays, etc.) and 2) identify the translation method (machine vs. human). Using semi-delexicalized models (excluding all nouns), we report results of up to 60.5% F-measure in distinguishing human from machine translations and 45.4% in discriminating between seven different genres. Beyond the classification performance itself, we argue that text classification methods can level out discriminative features of different variables (genres and translation methods), thus enabling researchers to investigate the properties of each in more detail.
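
As an illustration of the semi-delexicalized representation mentioned above (excluding all nouns), the following is a minimal sketch and not the authors' implementation. It assumes that tokens arrive already part-of-speech tagged and that noun tags begin with one of a small set of prefixes (STTS-style `NN`/`NE` or Universal POS `NOUN`/`PROPN`). The function name `semi_delexicalize` and the optional `placeholder` argument are hypothetical, introduced only for this example.

```python
from typing import List, Optional, Tuple

# Assumption: these prefixes mark noun tags (STTS "NN"/"NE", Universal POS "NOUN"/"PROPN").
NOUN_TAG_PREFIXES = ("NN", "NE", "NOUN", "PROPN")


def semi_delexicalize(
    tagged_tokens: List[Tuple[str, str]],
    placeholder: Optional[str] = None,
) -> List[str]:
    """Remove nouns from a POS-tagged sentence.

    If `placeholder` is None, nouns are dropped entirely; otherwise each
    noun is replaced by the placeholder string (a hypothetical variant).
    """
    output = []
    for token, tag in tagged_tokens:
        if tag.startswith(NOUN_TAG_PREFIXES):
            if placeholder is not None:
                output.append(placeholder)
            # With no placeholder, the noun is simply skipped.
        else:
            output.append(token)
    return output


# Hypothetical tagged German sentence, for illustration only.
sentence = [("Die", "ART"), ("Regierung", "NN"), ("plant", "VVFIN"), ("Reformen", "NN")]
dropped = semi_delexicalize(sentence)
masked = semi_delexicalize(sentence, placeholder="<N>")
```

Removing nouns is one way to reduce the influence of topic-specific vocabulary, so that a classifier trained on the remaining tokens has to rely more on function words and grammatical patterns than on subject matter.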