Journal: CONCURRENCY PRACTICE & EXPERIENCE

EcForest: Extractive document summarization through enhanced sentence embedding and cascade forest

Abstract

We present EcForest, an extractive summarization model built on Enhanced Sentence Embedding and Cascade Forest. Sentence representation is of great significance for many summarization methods. Bag-of-words representations mostly fail to grasp semantics, and typical embedding models cannot capture more complex semantic features, such as polysemy and the meaning of a phrase, which are usually lost when the word embeddings in a sentence are simply averaged. To this end, we propose the Enhanced Sentence Embedding (ESE) model, which addresses these drawbacks by mapping several valid features to dense vectors. Essentially, the enhanced sentence embedding is a novel model for improving the distributed representation of sentences. Our sentence embedding model is universally applicable and can be adapted to other NLP tasks. Moreover, deep forest is used as the sentence extraction algorithm for its robustness to hyper-parameters and its efficient training compared to deep neural networks. The evaluation of the variant models proposed in this work demonstrates the validity of the enhanced sentence embedding. Comparisons between EcForest and several baselines on two different datasets show that the proposed summarization model performs better than, or is highly competitive with, the state of the art.
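
The limitation of plain averaging mentioned in the abstract can be illustrated with a small sketch. This is not the ESE model, whose details are not given here; it only shows the averaging baseline the abstract criticizes. The toy vectors and the names `TOY_VECTORS` and `average_embedding` are hypothetical, chosen for illustration.

```python
from typing import Dict, List

# Hypothetical 3-dimensional word vectors; the values are made up.
TOY_VECTORS: Dict[str, List[float]] = {
    "dog": [1.0, 0.0, 0.0],
    "bites": [0.0, 1.0, 0.0],
    "man": [0.0, 0.0, 1.0],
}


def average_embedding(sentence: str, vectors: Dict[str, List[float]]) -> List[float]:
    """Return the mean of the word vectors of the in-vocabulary tokens."""
    tokens = [t for t in sentence.lower().split() if t in vectors]
    if not tokens:
        raise ValueError("sentence has no in-vocabulary tokens")
    dim = len(vectors[tokens[0]])
    # Average each dimension over all tokens; word order plays no role.
    return [sum(vectors[t][i] for t in tokens) / len(tokens) for i in range(dim)]


a = average_embedding("dog bites man", TOY_VECTORS)
b = average_embedding("man bites dog", TOY_VECTORS)
print(a == b)  # True: the two sentences get the same vector
```

Because averaging is order-insensitive, two sentences with opposite meanings receive identical representations here, which is the kind of information loss an enhanced sentence embedding would aim to reduce.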
