Extractive multi-document summarization based on textual entailment and sentence compression via knapsack problem

Naserasadi Ali; Khosravi Hamid; Sadeghi Faramarz


Abstract

As the amount of data in computer networks grows, searching for and finding suitable information becomes harder for users. Textual documents are one of the most widespread forms of information on such networks, so exploring them to learn about their content is difficult and sometimes impossible. Multi-document text summarization systems help by producing a summary of fixed, predefined length that covers as much of the content of the input documents as possible. This paper presents a novel method for multi-document extractive summarization based on textual entailment relations and sentence compression, formulating the problem as a knapsack problem. In this approach, the sentences of the documents are ranked according to the extended Tf-Idf method, and entailment scores of the selected sentences are then computed. The final score of each sentence is calculated from these scores. Finally, after sentence lengths are reduced via sentence compression, the problem is solved by greedy and dynamic programming approaches to the knapsack problem. Experiments on standard summarization datasets, evaluated with the ROUGE system, show that the suggested method, to the best of our knowledge, increases the F-measure of query-based summarization systems by two per cent and the F-measure of general summarization systems by five per cent.
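The selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each sentence already has a final score (for example, one combining the Tf-Idf rank and entailment score) and a word count measured after compression, and it treats the summary length limit as the knapsack capacity. The function names `knapsack_dp` and `knapsack_greedy` and the toy numbers are hypothetical.

```python
from typing import List, Sequence, Tuple


def knapsack_dp(scores: Sequence[float], lengths: Sequence[int],
                budget: int) -> Tuple[float, List[int]]:
    """Exact 0/1 knapsack: maximize total score with total length <= budget."""
    n = len(scores)
    # best[i][w] = best total score using the first i sentences within w words
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        length, score = lengths[i - 1], scores[i - 1]
        for w in range(budget + 1):
            best[i][w] = best[i - 1][w]  # option 1: skip sentence i-1
            if length <= w and best[i - 1][w - length] + score > best[i][w]:
                best[i][w] = best[i - 1][w - length] + score  # option 2: take it
    # Walk the table backwards to recover which sentences were taken.
    chosen: List[int] = []
    w = budget
    for i in range(n, 0, -1):
        if best[i][w] != best[i - 1][w]:
            chosen.append(i - 1)
            w -= lengths[i - 1]
    chosen.reverse()
    return best[n][budget], chosen


def knapsack_greedy(scores: Sequence[float], lengths: Sequence[int],
                    budget: int) -> Tuple[float, List[int]]:
    """Greedy heuristic: take sentences by score per word while they fit.

    Assumes every length is a positive integer.
    """
    order = sorted(range(len(scores)),
                   key=lambda i: scores[i] / lengths[i], reverse=True)
    total, used, chosen = 0.0, 0, []
    for i in order:
        if used + lengths[i] <= budget:
            chosen.append(i)
            used += lengths[i]
            total += scores[i]
    return total, sorted(chosen)


if __name__ == "__main__":
    # Hypothetical toy data: three candidate sentences, 5-word budget.
    toy_scores = [6.0, 10.0, 12.0]
    toy_lengths = [1, 2, 3]
    print(knapsack_dp(toy_scores, toy_lengths, 5))      # (22.0, [1, 2])
    print(knapsack_greedy(toy_scores, toy_lengths, 5))  # (16.0, [0, 1])
```

On this toy input the greedy heuristic picks the two densest sentences and then cannot fit the third, while dynamic programming finds the higher-scoring pair. Shortening sentences through compression lowers their lengths, which lets more content fit under the same budget.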

Bibliographic details

Source
Natural Language Engineering, 2019, Issue 1, pp. 121-146 (26 pages)
Authors
Naserasadi Ali; Khosravi Hamid; Sadeghi Faramarz
Author affiliations

Shahid Bahonar Univ Kerman, Fac Math & Comp, Dept Appl Math, Kerman, Iran;

Shahid Bahonar Univ Kerman, Fac Math & Comp, Dept Comp Sci, Kerman, Iran;

Shahid Bahonar Univ Kerman, Fac Math & Comp, Dept Comp Sci, Kerman, Iran;

Original format: PDF
Language: English

