Journal: Applied Soft Computing

Predicting insertion positions in word-level machine translation quality estimation

Abstract

Word-level machine translation (MT) quality estimation (QE) is usually formulated as the task of automatically identifying which words need to be edited (either deleted or replaced) in a translation T produced by an MT system. The advantage of estimating MT quality at the word level is that it can guide post-editors: it identifies the specific words in T that need to be edited, which eases their work. However, word-level MT QE, as defined in the current literature, has an obvious limitation: it does not identify the positions in T at which missing words need to be inserted. To deal with this limitation, we propose a method that identifies both word deletions and insertion positions in T. This is, to the best of our knowledge, the first approach allowing the identification of insertion positions in word-level MT QE. The proposed method can use any source of bilingual information, such as MT, dictionaries, or phrase-level translation memories, to extract features that are then used by a neural network to produce a prediction for both words and insertion positions (gaps between words) in the translation T. In this paper, several feature sets and neural network architectures are explored and evaluated on publicly available datasets used in previous evaluation campaigns for word-level MT QE. The results confirm the feasibility of the proposed approach, as well as the usefulness of sharing information between the two prediction tasks in order to obtain more reliable quality estimations. © 2018 Elsevier B.V. All rights reserved.
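
To make the two prediction targets concrete, the following minimal sketch derives reference labels for the words and the gaps of a translation T by aligning it with a post-edited version. It illustrates the task formulation only and is not the neural method described in the abstract. The function name `word_and_gap_labels`, the label names, and the use of Python's `difflib` alignment are all assumptions made for this illustration. As a further simplification, only pure insertions flag a gap; a replacement marks the affected words as BAD without flagging any gap.

```python
from difflib import SequenceMatcher


def word_and_gap_labels(translation, post_edit):
    """Return (word_labels, gap_labels) for a tokenised translation.

    word_labels[i] is "BAD" if word i was deleted or replaced, else "OK".
    gap_labels[i] is "INSERT" if words were inserted before word i
    (index len(translation) is the gap after the last word), else "OK".
    Hypothetical helper for illustration; not taken from the paper.
    """
    word_labels = ["OK"] * len(translation)
    # A translation of N words has N + 1 gaps (before, between, after).
    gap_labels = ["OK"] * (len(translation) + 1)

    matcher = SequenceMatcher(a=translation, b=post_edit, autojunk=False)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            # Words of T that do not survive unchanged in the post-edit.
            for i in range(i1, i2):
                word_labels[i] = "BAD"
        elif tag == "insert":
            # Missing words must be inserted in the gap before word i1.
            gap_labels[i1] = "INSERT"
    return word_labels, gap_labels


if __name__ == "__main__":
    t = "the cat sat mat".split()
    pe = "the cat sat on the mat".split()
    words, gaps = word_and_gap_labels(t, pe)
    print(words)
    print(gaps)
```

In this example every word of T is kept, and the gap between "sat" and "mat" is the only insertion position, which is exactly the kind of information that word-only labels cannot express.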