One of the limitations of translation memory systems is that the smallest translation units currently accessible are aligned sentential pairs. We propose an example-based machine translation system which uses a 'phrasal lexicon' in addition to the aligned sentences in its database. These phrases are extracted from the Penn Treebank using the Marker Hypothesis as a constraint on segmentation. They are then translated by three on-line machine translation (MT) systems, and a number of linguistic resources are automatically constructed which are used in the translation of new input.

We perform two experiments on test sets of sentences and noun phrases to demonstrate the effectiveness of our system. In so doing, we obtain insights into the strengths and weaknesses of the selected on-line MT systems. Finally, like many example-based machine translation systems, our approach also suffers from the problem of 'boundary friction'. Where this compromises the quality of the resulting translations, we use a novel, post hoc validation procedure via the World Wide Web to correct imperfect translations before they are output to the user.
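As an illustration of what using the Marker Hypothesis as a constraint on segmentation can look like, the following is a minimal sketch rather than the system's actual procedure. It assumes a common formulation in which a new chunk starts at each closed-class 'marker' word, provided the chunk being closed already contains at least one non-marker word. The marker list, the function name `marker_chunks`, and the whitespace tokenisation are all hypothetical simplifications introduced here.

```python
# Hypothetical, deliberately tiny marker list (assumption): a real system
# would use a much fuller closed-class lexicon.
MARKERS = {
    "the", "a", "an", "this", "that",               # determiners
    "in", "on", "of", "to", "with", "by", "from",   # prepositions
    "and", "or", "but",                             # conjunctions
    "he", "she", "it", "they", "we",                # pronouns
}


def is_marker(word):
    """Return True if the word belongs to the illustrative marker list."""
    return word.lower() in MARKERS


def marker_chunks(sentence):
    """Split a sentence into chunks that each begin at a marker word.

    A chunk is only closed once it holds at least one non-marker word,
    so runs of adjacent markers ("in the") stay in the same chunk.
    """
    chunks = []
    current = []
    for word in sentence.split():
        has_content = any(not is_marker(w) for w in current)
        if is_marker(word) and has_content:
            chunks.append(" ".join(current))
            current = []
        current.append(word)
    if current:
        if chunks and all(is_marker(w) for w in current):
            # Trailing markers with no content word join the previous chunk.
            chunks[-1] += " " + " ".join(current)
        else:
            chunks.append(" ".join(current))
    return chunks


if __name__ == "__main__":
    for chunk in marker_chunks("the man in the red hat walked to the station"):
        print(chunk)
```

Under these assumptions, chunks of this kind are the sort of sub-sentential unit that could populate a phrasal lexicon; how the extracted phrases are actually filtered and stored is not specified here.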