Published in: International Conference on Recent Advances in Natural Language Processing

Naive Regularizers for Low-Resource Neural Machine Translation



Abstract

Neural machine translation models have little inductive bias, which can be a disadvantage in low-resource scenarios. They require large volumes of data and often perform poorly when only limited data is available. We show that using naive regularization methods, based on sentence length, punctuation, and word frequencies, to penalize translations that are very different from the input sentences consistently improves translation quality across multiple low-resource languages. We experiment with 12 language pairs, varying the training data size between 17k and 230k sentence pairs. Our best regularizer achieves an average improvement of 1.5 BLEU and 1.0 TER across all language pairs. For example, we achieve a BLEU score of 26.70 on the IWSLT15 English-Vietnamese translation task simply by using relative differences in punctuation as a regularizer.
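The punctuation-based regularizer mentioned in the abstract can be sketched as a simple penalty on the relative difference in punctuation counts between the source sentence and the hypothesis translation. The function below is an illustrative assumption about one plausible formulation; the paper's exact definition and the weight with which the penalty enters the training loss may differ.

```python
import string

def punctuation_penalty(source: str, hypothesis: str) -> float:
    """Relative difference in punctuation counts between source and hypothesis.

    Illustrative sketch only: the normalization (here, the larger of the two
    counts) and the treatment of language-specific punctuation are assumptions,
    not the paper's exact formulation.
    """
    src_count = sum(ch in string.punctuation for ch in source)
    hyp_count = sum(ch in string.punctuation for ch in hypothesis)
    denom = max(src_count, hyp_count, 1)  # avoid division by zero
    return abs(src_count - hyp_count) / denom

# Identical punctuation -> no penalty; dropped punctuation -> penalty near 1.
print(punctuation_penalty("Hello, world!", "Xin chao, the gioi!"))  # 0.0
print(punctuation_penalty("Hello, world!", "Xin chao the gioi"))    # 1.0
```

During training, such a penalty would typically be added to the standard cross-entropy loss scaled by a hyperparameter, e.g. `loss = ce_loss + lam * punctuation_penalty(src, hyp)`; analogous penalties could be defined for sentence-length and word-frequency differences.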
