Word translation based on machine learning models using translation memory and corpora

Abstract

The second contest on word sense disambiguation, SENSEVAL-2, was held in spring 2001. It consisted of several tasks in various languages. In this paper, we describe the system we used for one of these tasks: the Japanese translation task. In this task, the senses of a word are defined in terms of the word's translations. Given an input sentence and a target word in that sentence, our system first estimates the similarity between the input sentence and each of a collection of parallel example sets called translation memory. It then selects an appropriate translation of the target word by using the example set with the highest similarity. The similarity is calculated using dynamic programming and a machine learning model that uses the following features: string similarity; the words to the left and to the right of the target word in the input sentence; the content words in the input sentence and their translations; and the co-occurrence of content words in bilingual and monolingual corpora in English and Japanese. Our system achieves an accuracy of 63.4%, finishing the contest in third place among nine systems developed by seven groups.
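
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: it covers only a dynamic-programming string similarity (here, a longest-common-subsequence score, chosen as an assumption) and omits the machine learning model and the context, content-word, and co-occurrence features. The function names, the toy translation memory, and its English example sentences are all hypothetical and serve only to show the "pick the most similar example" idea.

```python
from typing import List, Sequence, Tuple


def lcs_length(a: Sequence[str], b: Sequence[str]) -> int:
    """Length of the longest common subsequence, computed by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(a: Sequence[str], b: Sequence[str]) -> float:
    """Normalize the LCS length to [0, 1]; 1.0 means identical sequences."""
    if not a and not b:
        return 1.0
    return 2.0 * lcs_length(a, b) / (len(a) + len(b))


def select_translation(sentence: str, memory: List[Tuple[str, str]]) -> str:
    """Return the translation attached to the most similar example sentence.

    `memory` is a hypothetical translation memory: pairs of
    (example sentence, translation of the target word in that example).
    """
    tokens = sentence.split()
    best_example = max(memory, key=lambda ex: similarity(tokens, ex[0].split()))
    return best_example[1]


# Toy data (assumption): two examples for the target word "bank".
toy_memory = [
    ("he sat on the bank of the river", "kishi"),
    ("she deposited money at the bank", "ginkou"),
]
chosen = select_translation("they walked along the bank of the river", toy_memory)
```

In this sketch the first example shares the subsequence "the bank of the river" with the input, so its translation is selected. A system along the lines of the abstract would combine such a score with the other listed features inside a learned model rather than rely on string overlap alone.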