Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing
Translation Quality Estimation Using Only Bilingual Corpora

Abstract

In computer-aided translation scenarios, quality estimation of machine translation hypotheses plays a critical role. Existing methods for word-level translation quality estimation (TQE) rely on manually annotated TQE training data obtained via direct annotation or post-editing. However, due to the cost of human labor, such data are either limited in size or available for only a few tasks in practice. To avoid the reliance on such annotated TQE data, this paper proposes an approach to train word-level TQE models using bilingual corpora, which are typically used in machine translation training and are relatively easy to access. We formalize the training of our proposed method under the framework of maximum marginal likelihood estimation. To avoid degenerate solutions, we propose a novel regularized training objective whose optimization is achieved by an efficient approximation. Extensive experiments on both written and spoken language datasets empirically show that our approach yields performance comparable to standard training on annotated data.
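
For orientation, maximum marginal likelihood estimation in its generic textbook form treats whatever the training data does not contain as a latent variable and sums it out of the likelihood. The sketch below is not the paper's objective, which the abstract does not spell out. All symbols are assumptions introduced for illustration: x_i and y_i are the two sides of the i-th sentence pair in a bilingual corpus of N pairs, z is a latent variable standing for whatever the corpus leaves unobserved, θ are the model parameters, and R(θ) with weight λ is a placeholder for a regularizer of the kind the abstract says is needed to rule out degenerate solutions.

```latex
% Generic regularized maximum marginal likelihood objective (illustrative only).
% The latent variable z is marginalized out inside the logarithm;
% R(\theta) and \lambda are hypothetical placeholders, not the paper's terms.
\hat{\theta}
  = \arg\max_{\theta}
    \sum_{i=1}^{N} \log \sum_{z} p_{\theta}\left(y_i, z \mid x_i\right)
    \;-\; \lambda \, R(\theta)
```

The inner sum over z is usually intractable to compute exactly for structured latent variables, which fits the abstract's remark that optimization relies on an efficient approximation.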