Journal of Intelligent Information Systems

CGSPN : cascading gated self-attention and phrase-attention network for sentence modeling


Abstract

Sentence modeling is a critical issue for feature generation in some natural language processing (NLP) tasks. Recently, most works have generated sentence representations through sentence modeling based on Convolutional Neural Networks (CNNs), Long Short-Term Memory networks (LSTMs), and attention mechanisms. However, these models have two limitations: (1) they represent sentences for only one individual task by fine-tuning network parameters, and (2) sentence modeling considers only the concatenation of words and ignores the function of phrases. In this paper, we propose a Cascading Gated Self-attention and Phrase-attention Network (CGSPN) that generates the sentence embedding by considering contextual words and key phrases in a sentence. Specifically, we first present a word-interaction gated self-attention mechanism to identify important words and model the relationships between them. Then, we cascade a phrase-attention structure that abstracts the semantics of phrases to generate the sentence representation. Experiments on different NLP tasks show that the proposed CGSPN model achieves higher accuracy than most sentence-encoding methods. It improves on the previous best result by 1.76% on the Stanford Sentiment Treebank (SST) and achieves the best test accuracy on several sentence classification datasets. In the Natural Language Inference (NLI) task, CGSPN without phrase-attention performs better than the full CGSPN model and obtains competitive performance against state-of-the-art baselines, which shows the varying applicability of the proposed model. On other NLP tasks, we also compare our model with popular methods to explore this direction.
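The abstract describes the architecture only at a high level. The following is a minimal PyTorch sketch of the cascaded design it outlines: a gated self-attention layer over word representations, followed by a phrase-attention layer that pools phrase features into a sentence embedding. The module structure, dimensions, gating formulation, and the convolutional phrase composer are all illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn as nn


class GatedSelfAttention(nn.Module):
    """Self-attention whose output is modulated by a per-word gate."""

    def __init__(self, dim):
        super().__init__()
        self.query = nn.Linear(dim, dim)
        self.key = nn.Linear(dim, dim)
        self.value = nn.Linear(dim, dim)
        # Gate computed from each word and its attended context
        # (an assumed formulation, not necessarily the paper's).
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, x):                         # x: (batch, seq_len, dim)
        scores = self.query(x) @ self.key(x).transpose(1, 2)
        attn = torch.softmax(scores / x.size(-1) ** 0.5, dim=-1)
        context = attn @ self.value(x)
        g = torch.sigmoid(self.gate(torch.cat([x, context], dim=-1)))
        return g * context + (1 - g) * x          # gated mix of context/word


class PhraseAttention(nn.Module):
    """Composes n-gram phrase features, then attends over them."""

    def __init__(self, dim, ngram=3):
        super().__init__()
        # A 1-D convolution as a simple phrase composer (an assumption).
        self.compose = nn.Conv1d(dim, dim, kernel_size=ngram,
                                 padding=ngram // 2)
        self.score = nn.Linear(dim, 1)

    def forward(self, h):                         # h: (batch, seq_len, dim)
        phrases = torch.tanh(self.compose(h.transpose(1, 2))).transpose(1, 2)
        weights = torch.softmax(self.score(phrases), dim=1)
        return (weights * phrases).sum(dim=1)     # (batch, dim) sentence vec


# Usage: cascade the two layers to obtain a fixed-size sentence embedding.
words = torch.randn(8, 20, 128)                   # 8 sentences, 20 tokens
sentence_vec = PhraseAttention(128)(GatedSelfAttention(128)(words))
print(sentence_vec.shape)                         # torch.Size([8, 128])
```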
