Document Summarization Using Sentence-Level Semantic Based on Word Embeddings

Al-Sabahi Kamal; Zhang Zuping

Abstract

In the era of information overload, text summarization has become a focus of attention in diverse fields such as question answering systems, intelligence analysis, news recommendation systems, and search results in web search engines. A good document representation is the key to any successful summarizer. Learning this representation has become a very active research topic in natural language processing (NLP). Traditional approaches mostly fail to deliver a good representation. Word embeddings have shown excellent performance in learning the representation. In this paper, a modified BM25 with word embeddings is used to build sentence vectors from word vectors. The entire document is represented as a set of sentence vectors. Then, the similarity between every pair of sentence vectors is computed. After that, TextRank, a graph-based model, is used to rank the sentences. The summary is generated by picking the top-ranked sentences according to the compression rate. Two well-known datasets, DUC2002 and DUC2004, are used to evaluate the models. The experimental results show that the proposed models perform comprehensively better than the state-of-the-art methods.
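
The pipeline described in the abstract (sentence vectors from weighted word vectors, pairwise similarity, TextRank ranking, top-ranked selection) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the standard BM25 term weight because the abstract does not specify the modification, it assumes whitespace tokenization, it takes a hypothetical `embeddings` dictionary in place of a trained Word2Vec model, and it interprets the compression rate as the fraction of sentences kept. All function names are invented for this example.

```python
import math
from collections import Counter


def bm25_weights(sentences, k1=1.5, b=0.75):
    """Standard BM25 weight of each word in each tokenized sentence.
    Assumption: the paper's modified BM25 is not reproduced here."""
    n = len(sentences)
    avg_len = sum(len(s) for s in sentences) / n
    df = Counter(w for s in sentences for w in set(s))
    weights = []
    for s in sentences:
        tf = Counter(s)
        norm = k1 * (1 - b + b * len(s) / avg_len)
        weights.append({
            w: math.log(1 + (n - df[w] + 0.5) / (df[w] + 0.5))
            * f * (k1 + 1) / (f + norm)
            for w, f in tf.items()
        })
    return weights


def sentence_vector(weights, embeddings, dim):
    """BM25-weighted average of the word vectors found in `embeddings`."""
    vec, total = [0.0] * dim, 0.0
    for w, wt in weights.items():
        if w in embeddings:  # out-of-vocabulary words are skipped
            vec = [v + wt * e for v, e in zip(vec, embeddings[w])]
            total += wt
    return [v / total for v in vec] if total else vec


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(x * y for x, y in zip(u, v)) / (nu * nv) if nu and nv else 0.0


def textrank(sim, d=0.85, iters=50):
    """Power iteration for weighted PageRank over the similarity graph."""
    n = len(sim)
    out = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iters):
        scores = [
            (1 - d) / n + d * sum(sim[j][i] / out[j] * scores[j]
                                  for j in range(n) if out[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(sentences, embeddings, dim, compression_rate=0.5):
    """Return the top-ranked sentences in their original order.
    Assumption: compression_rate is the fraction of sentences kept."""
    if not sentences:
        return []
    tokens = [s.lower().split() for s in sentences]
    vecs = [sentence_vector(w, embeddings, dim) for w in bm25_weights(tokens)]
    n = len(vecs)
    # Negative similarities are clipped to 0 and self-loops are removed.
    sim = [[max(cosine(vecs[i], vecs[j]), 0.0) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    scores = textrank(sim)
    k = max(1, int(n * compression_rate))
    top = sorted(sorted(range(n), key=lambda i: -scores[i])[:k])
    return [sentences[i] for i in top]


if __name__ == "__main__":
    # Toy 2-dimensional embeddings, purely hypothetical values.
    toy = {"cats": [1.0, 0.1], "dogs": [0.9, 0.2], "mice": [0.8, 0.1],
           "stocks": [0.0, 1.0], "fell": [0.1, 1.0]}
    doc = ["cats chase mice", "dogs chase cats",
           "stocks fell today", "mice fear cats"]
    print(summarize(doc, toy, dim=2, compression_rate=0.5))
```

In practice the toy dictionary would be replaced by vectors from a trained embedding model, and the damping factor `d`, the BM25 parameters `k1` and `b`, and the similarity clipping are tunable choices rather than values taken from the paper.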


Bibliographic record

Source
International Journal of Software Engineering and Knowledge Engineering | 2019, Issue 2 | pp. 177-196 | 20 pages
Authors
Al-Sabahi Kamal; Zhang Zuping
Author affiliations

Cent S Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China;

Cent S Univ, Sch Informat Sci & Engn, Changsha 410083, Hunan, Peoples R China;

Indexing information
Original format: PDF
Language of text: English
Chinese Library Classification
Keywords
Word embedding; sentence vector; Word2Vec; document summarization; cosine similarity; BM25;

Date added to database: 2022-08-18 04:14:29

Similar literature

1. Intelligent multi-document summarization for biomedical literature by word embeddings and graph-based ranking [J]. Shen Chen, Lin Hongfei, Hao Huihui. Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology. 2019, Issue 4aPta1.
2. Text document summarization using word embedding [J]. Mohd Mudasir, Jan Rafiya, Shah Muzaffar. Expert Systems with Applications. 2020, Issue Apra.
3. Automatic Multi-Document Summarization Based on Keyword Density and Sentence-Word Graphs [J]. YE Feiyue, XU Xinchen. Journal of Shanghai Jiao Tong University (English Edition). 2018, Issue 004.
4. Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization [C]. Dingding Wang, Tao Li, Shenghuo Zhu. Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. 2008.
5. Multi-document summarization based on atomic semantic events and their temporal relations [D]. Uddin, Md Mohsin. 2015.
6. Evaluating semantic relations in neural word embeddings with biomedical and general domain knowledge bases [O]. Zhiwei Chen, Zhe He, Xiuwen Liu. 2018.
7. Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization [O]. Dingding Wang, Tao Li, Shenghuo Zhu. 2014.
