
Entity Commonsense Representation for Neural Abstractive Summarization



Abstract

A major proportion of a text summary consists of important entities found in the original text. These entities build up the topic of the summary. Moreover, they carry commonsense information once linked to a knowledge base. Based on these observations, this paper investigates the use of linked entities to guide the decoder of a neural text summarizer toward concise and better summaries. To this end, we leverage an off-the-shelf entity linking system (ELS) to extract linked entities and propose Entity2Topic (E2T), a module easily attachable to a sequence-to-sequence model that transforms a list of entities into a vector representation of the topic of the summary. Currently available ELSs are still not sufficiently effective, possibly introducing unresolved ambiguities and irrelevant entities. We mitigate these imperfections of the ELS by (a) encoding entities with selective disambiguation, and (b) pooling entity vectors using firm attention. By applying E2T to a simple sequence-to-sequence model with an attention mechanism as the base model, we observe performance improvements of at least 2 ROUGE points on the Gigaword (sentence-to-title) and CNN (long document to multi-sentence highlights) summarization datasets.
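The two fixes named in the abstract, selective disambiguation and firm attention, lend themselves to a short sketch. Below is a minimal PyTorch module in the spirit of E2T, assuming pre-trained entity embeddings as input; all layer names, shapes, and the top-k cutoff are hypothetical choices of this sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Entity2Topic(nn.Module):
    """Minimal sketch of the E2T idea: turn a list of linked-entity
    embeddings into one topic vector for the summarizer decoder.
    Layer names and shapes are hypothetical, not the authors' code."""

    def __init__(self, dim, k=5):
        super().__init__()
        self.k = k  # number of entities kept by firm attention
        # (a) Selective disambiguation: a BiGRU yields a context-aware
        # encoding of each entity; a sigmoid gate mixes it with the raw
        # embedding, so unambiguous entities can pass through unchanged.
        # (Assumes dim is even so the two GRU directions concatenate to dim.)
        self.rnn = nn.GRU(dim, dim // 2, bidirectional=True, batch_first=True)
        self.gate = nn.Linear(2 * dim, dim)
        # (b) Firm attention: a scoring layer used to pick the top-k entities.
        self.score = nn.Linear(dim, 1)

    def forward(self, ents):                  # ents: (batch, n_entities, dim)
        ctx, _ = self.rnn(ents)               # contextual entity encodings
        g = torch.sigmoid(self.gate(torch.cat([ents, ctx], dim=-1)))
        enc = g * ctx + (1 - g) * ents        # selectively disambiguated entities
        # Firm attention: mask out all but the k highest-scoring entities,
        # then soft-attend only over the survivors.
        s = self.score(enc).squeeze(-1)       # (batch, n_entities)
        topk = s.topk(min(self.k, s.size(1)), dim=-1).indices
        mask = torch.full_like(s, float('-inf')).scatter(1, topk, 0.0)
        a = F.softmax(s + mask, dim=-1)       # zero weight outside the top-k
        return (a.unsqueeze(-1) * enc).sum(1) # topic vector: (batch, dim)
```

The design point worth noting is that firm attention is hard across the entity set (only the top-k survive) but soft within it, which matches the abstract's motivation of discarding irrelevant entities introduced by an imperfect ELS.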
