Conference on Empirical Methods in Natural Language Processing (EMNLP)

Learning Universal Sentence Representations with Mean-Max Attention Autoencoder

Abstract

In order to learn universal sentence representations, previous methods focus on complex recurrent neural networks or supervised learning. In this paper, we propose a mean-max attention autoencoder (mean-max AAE) within the encoder-decoder framework. Our autoencoder relies entirely on the multi-head self-attention mechanism to reconstruct the input sequence. In the encoding stage, we propose a mean-max strategy that applies both mean and max pooling operations over the hidden vectors to capture diverse information from the input. To let this information steer the reconstruction process dynamically, the decoder performs attention over the mean-max representation. By training our model on a large collection of unlabelled data, we obtain high-quality sentence representations. Experimental results on a broad range of 10 transfer tasks demonstrate that our model outperforms state-of-the-art unsupervised single methods, including the classical skip-thoughts model (Kiros et al., 2015) and the advanced skip-thoughts+LN model (Ba et al., 2016). Furthermore, compared with traditional recurrent neural networks, our mean-max AAE greatly reduces training time.
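The abstract's two core ideas, concatenating mean and max pooling into a sentence representation and letting the decoder attend over that representation, can be illustrated with a minimal PyTorch sketch. All names, shapes, and hyperparameters below (e.g. d_model, the use of nn.MultiheadAttention) are illustrative assumptions for exposition, not the authors' released implementation.

```python
# A hedged sketch of the mean-max encoding and decoder-side attention
# described in the abstract. Names and hyperparameters are assumptions.
import torch
import torch.nn as nn

d_model = 512  # assumed hidden size

def mean_max_pool(hidden: torch.Tensor) -> torch.Tensor:
    """Apply mean and max pooling over the time dimension and concatenate.

    hidden: encoder hidden vectors of shape (batch, seq_len, d_model)
    returns: mean-max sentence representation of shape (batch, 2 * d_model)
    """
    mean_vec = hidden.mean(dim=1)        # (batch, d_model)
    max_vec = hidden.max(dim=1).values   # (batch, d_model)
    return torch.cat([mean_vec, max_vec], dim=-1)

# One way the decoder can "perform attention over the mean-max
# representation": treat the two pooled vectors as a length-2 memory,
# so attention weights shift between the mean and max views per step.
attn = nn.MultiheadAttention(embed_dim=d_model, num_heads=8, batch_first=True)

hidden = torch.randn(4, 20, d_model)                     # toy encoder states
memory = torch.stack([hidden.mean(dim=1),
                      hidden.max(dim=1).values], dim=1)  # (4, 2, d_model)
queries = torch.randn(4, 20, d_model)                    # toy decoder states
context, weights = attn(queries, memory, memory)         # (4, 20, d_model)
```

Intuitively, max pooling picks out the most salient feature per dimension while mean pooling preserves overall context; attending over both pooled vectors lets the decoder weigh these two views dynamically at each reconstruction step, as the abstract describes.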