Conference on Empirical Methods in Natural Language Processing

An Analysis of Encoder Representations in Transformer-Based Machine Translation

Abstract

The attention mechanism is a successful technique in modern NLP, especially in tasks like machine translation. The recently proposed Transformer network architecture is based entirely on attention mechanisms and achieves new state-of-the-art results in neural machine translation, outperforming other sequence-to-sequence models. However, so far not much is known about the internal properties of the model and the representations it learns to achieve that performance. To study this question, we investigate the information learned by the attention mechanism in Transformer models of varying translation quality. We assess the representations of the encoder by extracting dependency relations based on self-attention weights, perform four probing tasks to study how much syntactic and semantic information they capture, and test attention in a transfer learning scenario. Our analysis sheds light on the relative strengths and weaknesses of the various encoder representations. We observe that specific attention heads mark syntactic dependency relations, and we can also confirm that lower layers tend to learn more about syntax while higher layers tend to encode more semantics.
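The abstract mentions extracting dependency relations from self-attention weights. As a rough illustration of that idea (this is a minimal sketch, not the authors' released code; the helper name and the toy attention matrix are invented for the example), one can read the most-attended position of each token as its predicted syntactic head:

```python
import numpy as np

def attention_to_heads(attn, tokens):
    """Read a predicted syntactic head for each token off one attention
    head: the position it attends to most. Hypothetical helper, written
    for this sketch rather than taken from the paper's code."""
    # attn[i, j] is the softmax weight with which token i attends to token j.
    predicted = attn.argmax(axis=-1)
    return [(tokens[i], tokens[j]) for i, j in enumerate(predicted)]

# Toy 4-token example with a hand-crafted attention matrix.
tokens = ["The", "cat", "sleeps", "."]
attn = np.array([
    [0.1, 0.7, 0.1, 0.1],   # "The"    -> mostly "cat"
    [0.1, 0.1, 0.7, 0.1],   # "cat"    -> mostly "sleeps"
    [0.2, 0.2, 0.4, 0.2],   # "sleeps" -> mostly itself (root-like)
    [0.1, 0.1, 0.7, 0.1],   # "."      -> mostly "sleeps"
])
for dependent, head in attention_to_heads(attn, tokens):
    print(f"{dependent} -> {head}")
```

Comparing such argmax relations against a gold dependency parse, head by head and layer by layer, is one way to quantify how strongly individual attention heads mark syntactic relations.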
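The four probing tasks are likewise described only at a high level here. A minimal sketch of the usual probing setup, with random placeholder vectors standing in for frozen encoder states and an invented binary property label, might look like this:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Placeholder data: in a real probe, X would hold frozen encoder
# representations of sentences and y a linguistic property of each
# sentence (e.g. main-verb tense). Both are random stand-ins here.
X_train, y_train = rng.normal(size=(1000, 512)), rng.integers(0, 2, size=1000)
X_test, y_test = rng.normal(size=(200, 512)), rng.integers(0, 2, size=200)

# A simple linear classifier is trained on top of the fixed representations;
# its test accuracy is taken as a measure of how much of the property the
# representations encode (chance level is 0.5 for this random data).
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("probing accuracy:", probe.score(X_test, y_test))
```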

