Annual Meeting of the Association for Computational Linguistics

Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms



Abstract

Many deep learning architectures have been proposed to model the compositionality in text sequences, requiring a substantial number of parameters and expensive computations. However, there has not been a rigorous evaluation regarding the added value of sophisticated compositional functions. In this paper, we conduct a point-by-point comparative study between Simple Word-Embedding-based Models (SWEMs), consisting of parameter-free pooling operations, relative to word-embedding-based RNN/CNN models. Surprisingly, SWEMs exhibit comparable or even superior performance in the majority of cases considered. Based upon this understanding, we propose two additional pooling strategies over learned word embeddings: (i) a max-pooling operation for improved interpretability; and (ii) a hierarchical pooling operation, which preserves spatial (n-gram) information within text sequences. We present experiments on 17 datasets encompassing three tasks: (i) (long) document classification; (ii) text sequence matching; and (iii) short text tasks, including classification and tagging.
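The parameter-free pooling operations the abstract refers to can be sketched in a few lines. The snippet below is an illustrative NumPy sketch (not the authors' released code): average pooling, max pooling, and hierarchical pooling that averages within each n-gram window and then max-pools across windows; the function names and the toy embedding matrix are assumptions for illustration.

```python
import numpy as np

def swem_avg(emb):
    # Average pooling over the sequence axis: (L, d) -> (d,)
    return emb.mean(axis=0)

def swem_max(emb):
    # Max pooling: each embedding dimension keeps its largest
    # value across all words, which aids interpretability
    return emb.max(axis=0)

def swem_hier(emb, n=3):
    # Hierarchical pooling: average within each n-gram window,
    # then max-pool over windows to retain local word order
    L, _ = emb.shape
    windows = np.stack([emb[i:i + n].mean(axis=0)
                        for i in range(L - n + 1)])
    return windows.max(axis=0)

# Toy input: a "sentence" of 5 words with 4-dimensional embeddings
emb = np.random.randn(5, 4)
avg, mx, hier = swem_avg(emb), swem_max(emb), swem_hier(emb)
```

All three operators map a variable-length sequence of word vectors to a single fixed-size vector with no trainable parameters, which is what makes SWEMs so much cheaper than RNN/CNN encoders.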
