
Word translation based on machine learning models using translation memory and corpora



Abstract

The second contest on word sense disambiguation, SENSEVAL-2, was held in spring 2001 and consisted of several tasks in various languages. In this paper, we describe the system we used for one of these tasks: the Japanese translation task. In this task, the senses of a word are defined in terms of its translations. Given an input sentence and a target word in that sentence, our system first estimates the similarity between the input sentence and the parallel example sets, referred to as the translation memory. It then selects an appropriate translation of the target word using the example set with the highest similarity. The similarity is calculated using dynamic programming and a machine learning model with the following features: string similarity, the words to the left and right of the target word in the input sentence, the content words in the input sentence and their translations, and the co-occurrence of content words in bilingual and monolingual corpora in English and Japanese. Our system achieved an accuracy of 63.4%, finishing third among the nine systems developed by seven groups.
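
The selection step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions: the abstract does not say which dynamic programming formulation is used or how the machine learning model combines its features, so this example substitutes a plain token-level edit distance for the similarity and omits the learned model entirely. The names `Example`, `edit_distance`, `similarity`, and `select_translation` are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    """One translation-memory entry (hypothetical structure)."""
    source: List[str]   # tokenized source-language example sentence
    translation: str    # translation of the target word in this example


def edit_distance(a: List[str], b: List[str]) -> int:
    """Token-level Levenshtein distance computed by dynamic programming."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def similarity(a: List[str], b: List[str]) -> float:
    """Normalize the distance into a score in [0, 1]; 1 means identical."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def select_translation(sentence: List[str], memory: List[Example]) -> str:
    """Return the translation attached to the most similar example."""
    best = max(memory, key=lambda ex: similarity(sentence, ex.source))
    return best.translation
```

In the system the abstract describes, string similarity is only one of several features. A fuller version would pass it, together with the context-word and co-occurrence features, to a trained model instead of ranking examples on edit distance alone.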
