ACM Transactions on Asian and Low-Resource Language Information Processing

Layer-Wise De-Training and Re-Training for ConvS2S Machine Translation


Abstract

The convolutional sequence-to-sequence (ConvS2S) machine translation system is one of the typical neural machine translation (NMT) systems. In our preliminary studies, training the ConvS2S model tends to get stuck in a local optimum. To overcome this behavior, we propose to de-train a trained ConvS2S model in a mild way and then retrain it to find a better solution globally. In particular, the trained parameters of one layer of the NMT network are abandoned by re-initialization while the other layers' parameters are kept, which kicks off re-optimization from a new start point and at the same time keeps that start point not too far from the previous optimum. This procedure is executed layer by layer until all layers of the ConvS2S model have been explored. Experiments show that, compared to various measures for escaping from a local optimum, including initialization with random seeds, adding perturbations to the baseline parameters, and continuing training (con-training) with the baseline models, our method consistently improves ConvS2S translation quality across various language pairs and achieves better performance.
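
To make the layer-by-layer procedure above concrete, here is a minimal PyTorch-style sketch of de-training one layer at a time and retraining. The encoder.layers / decoder.layers attributes and the train_fn / eval_fn helpers are hypothetical stand-ins rather than the paper's actual code, and keeping the best-scoring candidate after each layer is our own assumption; the abstract only states that every layer is explored in turn.

    import copy

    def reinit(layer):
        """De-train one layer: re-initialize every parameterized
        sub-module, abandoning its trained weights."""
        for m in layer.modules():
            if hasattr(m, "reset_parameters"):
                m.reset_parameters()

    def layerwise_detrain_retrain(model, train_fn, eval_fn):
        """Sketch of layer-wise de-training and re-training.
        `train_fn(model)` retrains the model in place and
        `eval_fn(model)` returns a validation score (e.g., BLEU);
        both are hypothetical helpers, as are the
        `encoder.layers` / `decoder.layers` attributes."""
        best = copy.deepcopy(model)
        best_score = eval_fn(best)
        n_layers = len(best.encoder.layers) + len(best.decoder.layers)
        for i in range(n_layers):
            cand = copy.deepcopy(best)
            layers = list(cand.encoder.layers) + list(cand.decoder.layers)
            reinit(layers[i])       # de-train: re-initialize layer i only
            train_fn(cand)          # re-train from the new start point
            score = eval_fn(cand)
            if score > best_score:  # assumption: keep the better solution
                best, best_score = cand, score
        return best

Because only one layer is re-initialized per round while the others keep their trained values, each re-optimization starts from a point that is new yet still close to the previous optimum, which matches the "mild" de-training described in the abstract.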
