ACM Transactions on Asian Language Information Processing

Discriminative Training for Log-Linear Based SMT: Global or Local Methods



Abstract

In statistical machine translation, standard methods such as MERT tune a single weight vector on a given development set. These methods suffer from two problems caused by the diversity and uneven distribution of source sentences. First, their performance depends heavily on the choice of the development set, which can make testing performance unstable. Second, sentence-level translation quality is not assured, since tuning is performed at the document level rather than at the sentence level. In contrast to standard global training, in which a single weight vector is learned, we propose novel local training methods to address these two problems. We perform training and testing in one step by locally learning a sentence-wise weight vector for each input sentence. Since the time for each tuning step is non-negligible, and learning sentence-wise weights for the entire test set requires many passes of tuning, efficiency poses a great challenge for local training. We propose an efficient two-phase method that puts local training into practice by employing the ultraconservative update. On NIST Chinese-to-English translation tasks with both medium and large scales of training data, our local training methods significantly outperform the standard methods, with maximum improvements of up to 2.0 BLEU points, while their efficiency remains comparable to that of the standard methods.
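
The ultraconservative update mentioned in the abstract belongs to the MIRA family of large-margin online learners. The following is a minimal sketch of one such clipped update for a sentence-wise weight vector, assuming NumPy feature vectors and a single hope/fear translation pair; the function name, parameters, and the clipping constant C are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def ultraconservative_update(w, feat_hope, feat_fear, loss, C=0.01):
    """One clipped MIRA-style update (illustrative sketch, not the paper's
    exact formulation). Moves w just enough that the 'hope' translation
    outscores the 'fear' translation by a margin equal to loss, while
    staying as close as possible to the old weights."""
    delta = feat_hope - feat_fear                # feature-vector difference
    violation = loss - float(np.dot(w, delta))   # margin shortfall under current w
    if violation <= 0.0:
        return w                                 # margin already satisfied: keep w
    norm_sq = float(np.dot(delta, delta))
    if norm_sq == 0.0:
        return w                                 # identical features: nothing to learn
    tau = min(C, violation / norm_sq)            # clipped step keeps update conservative
    return w + tau * delta

# Hypothetical usage: local tuning for one input sentence could start from
# the global weights and apply a few such updates on development sentences
# retrieved as similar to the input.
w_global = np.zeros(8)
feat_hope = np.array([1.0, 0.5, 0.0, 0.2, 0.0, 0.1, 0.0, 0.3])
feat_fear = np.array([0.8, 0.9, 0.1, 0.0, 0.2, 0.0, 0.1, 0.0])
w_local = ultraconservative_update(w_global, feat_hope, feat_fear, loss=0.4)
```

Because each update has a closed form, a handful of them per test sentence is cheap, which is consistent with the abstract's claim that the local methods' efficiency stays comparable to standard tuning.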
