Extractive multi-document summarization using population-based multicriteria optimization

John Ansamma; Premjith P. S.; Wilscy M.


Abstract

Multi-document summarization is the process of extracting salient information from a set of source texts and presenting that information to the user in a condensed form. In this paper, we propose a multi-document summarization system that generates an extractive generic summary with maximum relevance and minimum redundancy by representing each sentence of the input documents as a vector of words drawn from the Proper Noun, Noun, Verb and Adjective sets. Five features associated with the sentences (TF_ISF, Aggregate Cross Sentence Similarity, Title Similarity, Proper Noun and Sentence Length) are extracted, and scores are assigned to sentences based on these features. The weights assigned to the features may vary with the nature of the document, and it is hard to discover the most appropriate weight for each feature, which makes generating a good summary a very tough task without human intelligence. The multi-document summarization problem has a large number of decision parameters and a large number of possible solutions from which the optimal summary must be generated. A generated summary may not guarantee the essential quality and may be far from the ideal human-generated summary. To address this issue, we propose a population-based multicriteria optimization method with multiple objective functions. Three objective functions are selected to determine an optimal summary, with maximum relevance, diversity, and novelty, from a global population of summaries by considering both the statistical and semantic aspects of the documents. Semantic aspects are captured using Latent Semantic Analysis (LSA) and Non-Negative Matrix Factorization (NMF) techniques. Experiments were performed on the DUC 2002, DUC 2004 and DUC 2006 datasets using the ROUGE toolkit. Experimental results show that our system outperforms state-of-the-art works in terms of Recall and Precision. (C) 2017 Elsevier Ltd. All rights reserved.
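The feature-based scoring mentioned in the abstract can be illustrated with a minimal sketch. It assumes a common TF-ISF definition (term frequency times the log of the number of sentences over the number of sentences containing the term, averaged per token) and combines it with a normalized sentence-length feature through a weighted sum. The function names, the toy sentences, and the weights are hypothetical; the abstract does not give the paper's exact formulas or weight values, and the optimization step is not shown.

```python
import math
from collections import Counter


def tf_isf_scores(sentences):
    """Return the average TF-ISF weight per token for each tokenized sentence.

    Assumed definition: tf(t, s) * log(N / sf(t)), where N is the number of
    sentences and sf(t) is the number of sentences containing term t.
    """
    n = len(sentences)
    # Sentence frequency: in how many sentences each term occurs.
    sf = Counter(term for sent in sentences for term in set(sent))
    scores = []
    for sent in sentences:
        if not sent:
            scores.append(0.0)
            continue
        tf = Counter(sent)
        total = sum(tf[t] * math.log(n / sf[t]) for t in tf)
        scores.append(total / len(sent))
    return scores


def combine(feature_rows, weights):
    """Weighted sum of the feature values of each sentence."""
    return [sum(w * f for w, f in zip(weights, row)) for row in feature_rows]


# Toy input: sentences already reduced to content words (hypothetical data).
sentences = [
    ["summarization", "extracts", "salient", "information"],
    ["summarization", "system", "generates", "summary"],
    ["weather", "sunny"],
]

tfisf = tf_isf_scores(sentences)
max_len = max(len(s) for s in sentences)
length = [len(s) / max_len for s in sentences]

weights = (0.7, 0.3)  # illustrative weights, not values from the paper
scores = combine(list(zip(tfisf, length)), weights)

# Sentence indices ordered from highest to lowest combined score.
ranking = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
```

In the approach the abstract describes, such weights would not be fixed by hand; a fixed weight vector is used here only to keep the sketch short.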


Bibliographic record

Source
Expert Systems with Applications | 2017, No. 11 | pp. 385-397 | 13 pages
Authors
John Ansamma; Premjith P. S.; Wilscy M.
Affiliations

Univ Kerala, Dept Comp Sci, Kariyavattom, Kerala, India;

TKM Coll Engn, Dept Comp Sci, Kollam, Kerala, India;

Univ Kerala, Dept Comp Sci, Kariyavattom, Kerala, India;

Original format: PDF
Language: English (eng)
Keywords
Multi-document summarization; Multicriteria optimization; Latent semantic analysis; Non negative matrix factorization; DUC; ROUGE

Date added to database: 2022-08-17 13:29:12

Similar literature

1. Extractive multi-document text summarization using dolphin swarm optimization approach [J]. Srivastava Atul Kumar, Pandey Dhiraj, Agarwal Alok. Multimedia Tools and Applications. 2021, No. 7.
2. A decomposition-based multi-objective optimization approach for extractive multi-document text summarization [J]. Applied Soft Computing. 2020.
3. Parallelizing a multi-objective optimization approach for extractive multi-document text summarization [J]. Sanchez-Gomez Jesus M., Vega-Rodriguez Miguel A., Perez Carlos J. Journal of Parallel and Distributed Computing. 2019, No. Deca.
4. Optimizing an Approximation of ROUGE - a Problem-Reduction Approach to Extractive Multi-Document Summarization [C]. Maxime Peyrard, Judith Eckle-Kohler. Annual meeting of the Association for Computational Linguistics. 2016.
5. Multi-document Summarization Based on Document Clustering and Neural Sentence Fusion [D]. Fuad, Tanvir Ahmed. 2018.
6. Extractive single document summarization using binary differential evolution: Optimization of different sentence quality measures [O]. Naveen Saini, Sriparna Saha, Dhiraj Chakraborty. 2019.
7. A Framework for Multi-document Extractive Summarization of Reviews with Aspect-based Sentiment Analysis [O]. André Oliveira, Anna Costa, Eduardo Hruschka. 2020.
