Journal: Computer Speech and Language

Batch-mode semi-supervised active learning for statistical machine translation


Abstract

The development of high-performance statistical machine translation (SMT) systems depends on the availability of substantial in-domain parallel training corpora. Such corpora, however, are expensive to produce because manual translation is labor-intensive. We propose to alleviate this problem with a novel semi-supervised, batch-mode active learning strategy that attempts to maximize in-domain coverage by selecting sentences that balance domain match, translation difficulty, and batch diversity. Simulation experiments on an English-to-Pashto translation task show that the proposed strategy outperforms not only the random selection baseline but also traditional active selection techniques based on dissimilarity to existing training data.
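As a rough illustration of what batch-mode selection balancing these three criteria might look like, the sketch below greedily builds a batch from a candidate pool. It is not the authors' algorithm: the abstract does not specify how the criteria are scored or combined, so the function names (`ngrams`, `select_batch`), the linear weighting, the bigram-novelty diversity measure, and the precomputed `domain_match` and `difficulty` scores are all assumptions made for illustration. The semi-supervised component is not modeled here.

```python
from typing import List, Sequence, Set, Tuple


def ngrams(tokens: Sequence[str], n: int = 2) -> Set[Tuple[str, ...]]:
    """Return the set of n-grams in a token list (the whole list if shorter than n)."""
    if len(tokens) < n:
        return {tuple(tokens)} if tokens else set()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def select_batch(
    sentences: List[str],
    domain_match: List[float],   # assumed precomputed, higher = closer to target domain
    difficulty: List[float],     # assumed precomputed, higher = harder to translate
    batch_size: int,
    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0),  # hypothetical weighting
) -> List[int]:
    """Greedily pick sentence indices, re-scoring diversity after every pick."""
    w_dom, w_diff, w_div = weights
    covered: Set[Tuple[str, ...]] = set()   # bigrams already in the batch
    remaining = set(range(len(sentences)))
    batch: List[int] = []

    while remaining and len(batch) < batch_size:
        def score(i: int) -> float:
            grams = ngrams(sentences[i].split())
            # Diversity proxy: share of this sentence's bigrams not yet in the batch.
            novelty = len(grams - covered) / len(grams) if grams else 0.0
            return w_dom * domain_match[i] + w_diff * difficulty[i] + w_div * novelty

        # Iterating in sorted order makes ties resolve to the lowest index.
        best = max(sorted(remaining), key=score)
        batch.append(best)
        covered |= ngrams(sentences[best].split())
        remaining.remove(best)

    return batch


if __name__ == "__main__":
    pool = [
        "where is the clinic",
        "where is the clinic",        # duplicate: adds nothing once the first is chosen
        "the road is closed ahead",
        "hello",
    ]
    print(select_batch(pool, [0.9, 0.9, 0.6, 0.1], [0.5, 0.5, 0.5, 0.1], batch_size=2))
```

In this toy pool the duplicate sentence loses its diversity credit as soon as its twin is selected, so a lower-scoring but novel sentence is chosen instead. That is the intuition behind scoring a batch jointly rather than ranking candidates independently.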
