Classical Structured Prediction Losses for Sequence to Sequence Learning

Abstract

There has been much recent work on training neural attention models at the sequence level, using either reinforcement-learning-style methods or by optimizing the beam. In this paper, we survey a range of classical objective functions that have been widely used to train linear models for structured prediction, and apply them to neural sequence-to-sequence models. Our experiments show that these losses can perform surprisingly well, slightly outperforming beam search optimization in a like-for-like setup. We also report new state-of-the-art results on both IWSLT'14 German-English translation and Gigaword abstractive summarization. On the larger WMT'14 English-French task, sequence-level training achieves 41.5 BLEU, which is on par with the state of the art.
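As a rough illustration of one such classical sequence-level objective, the sketch below shows expected risk minimization over an n-best candidate list in PyTorch. This is not the authors' implementation; the function name `expected_risk_loss` and the toy tensors are hypothetical, and in a real setup the scores would come from beam search hypotheses and the costs from a sentence-level metric such as 1 - BLEU.

```python
# Minimal sketch (assumed, not the paper's code) of expected risk
# minimization over a candidate set of output sequences.
import torch
import torch.nn.functional as F

def expected_risk_loss(scores, costs):
    """Expected cost of the model's distribution over candidates.

    scores: (num_candidates,) model log-scores per candidate hypothesis,
            e.g. summed token log-probabilities from beam search.
    costs:  (num_candidates,) task cost per candidate, e.g. 1 - sentence BLEU.
    """
    # Normalize the scores over the candidate set into a distribution.
    probs = F.softmax(scores, dim=0)
    # Expected cost under that distribution; lower is better.
    return torch.sum(probs * costs)

# Toy usage: three candidates with model scores and metric-derived costs.
scores = torch.tensor([2.1, 1.3, 0.4], requires_grad=True)
costs = torch.tensor([0.2, 0.5, 0.9])
loss = expected_risk_loss(scores, costs)
loss.backward()  # gradients shift probability mass toward low-cost candidates
```

Minimizing this expectation pushes probability mass toward candidates with low cost, which is the sense in which such losses train the model at the sequence level rather than token by token.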