Improving Multi-head Attention with Capsule Networks

Abstract

Multi-head attention advances neural machine translation by computing multiple versions of attention in different subspaces, but it neglects the semantic overlap between subspaces, which increases the difficulty of translation and consequently hinders further improvement of translation performance. In this paper, we employ capsule networks to comb the information from the multiple attention heads so that similar information is clustered and unique information is preserved. To this end, we adopt two routing mechanisms, Dynamic Routing and EM Routing, to perform the clustering and separation. We conducted experiments on Chinese-to-English and English-to-German translation tasks and obtained consistent improvements over the strong Transformer baseline.
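
The routing idea in the abstract can be pictured with a short sketch. Below is a minimal, illustrative PyTorch implementation (not the authors' released code) of aggregating the per-head outputs of multi-head attention with Dynamic Routing (Sabour et al., 2017): the h head outputs serve as input capsules, and a small set of output capsules replaces the usual concatenate-and-project step. The module name and all shapes and hyperparameters (n_out, d_out, n_iter) are assumptions made for illustration; EM Routing would plug into the same interface with a different update rule.

import torch
import torch.nn as nn
import torch.nn.functional as F


def squash(s, dim=-1, eps=1e-8):
    # Capsule squashing non-linearity: keeps direction, bounds norm in [0, 1).
    sq_norm = (s * s).sum(dim=dim, keepdim=True)
    return (sq_norm / (1.0 + sq_norm)) * s / torch.sqrt(sq_norm + eps)


class HeadRoutingAggregator(nn.Module):
    # Combs n_heads attention-head outputs into n_out capsules via Dynamic
    # Routing, so heads carrying similar information are clustered together.
    def __init__(self, n_heads, d_head, n_out, d_out, n_iter=3):
        super().__init__()
        self.n_iter = n_iter
        # One transformation matrix per (input head, output capsule) pair.
        self.W = nn.Parameter(0.01 * torch.randn(n_heads, n_out, d_head, d_out))

    def forward(self, heads):
        # heads: (batch, seq, n_heads, d_head) -- per-head attention outputs.
        b, t, h, _ = heads.shape
        # Prediction vectors u_hat[j|i]: (batch, seq, h, n_out, d_out).
        u_hat = torch.einsum('bthd,hjde->bthje', heads, self.W)
        logits = torch.zeros(b, t, h, self.W.shape[1], device=heads.device)
        v = None
        for it in range(self.n_iter):
            c = F.softmax(logits, dim=-1)            # coupling coefficients
            s = torch.einsum('bthj,bthje->btje', c, u_hat)
            v = squash(s)                            # output capsules
            if it < self.n_iter - 1:
                # Agreement between predictions and outputs refines the routing.
                logits = logits + torch.einsum('bthje,btje->bthj', u_hat, v)
        # Concatenated output capsules form the attention sublayer's output.
        return v.reshape(b, t, -1)

Usage: with n_out * d_out equal to the model dimension, the module can stand in for the concatenation-and-projection step that normally follows the attention heads, e.g. HeadRoutingAggregator(n_heads=8, d_head=64, n_out=8, d_out=64) maps an input of shape (2, 5, 8, 64) to (2, 5, 512).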
