Journal: Information Processing & Management

Applying regression models to query-focused multi-document summarization

Abstract

Most existing research on applying machine learning techniques to document summarization explores either classification models or learning-to-rank models. This paper presents our recent study on how to apply a different kind of learning model, namely regression models, to query-focused multi-document summarization. We use Support Vector Regression (SVR) to estimate, from a set of pre-defined features, the importance of each sentence in the document set to be summarized. To learn the regression models, we propose several methods for constructing "pseudo" training data by assigning each sentence a "nearly true" importance score calculated from the human summaries provided for the corresponding document set. A series of evaluations on the DUC data sets is conducted to examine the efficiency and robustness of the proposed approaches. When compared with classification models and ranking models, regression models are consistently preferable.
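The "pseudo" training data idea can be illustrated with a minimal sketch. The abstract does not say how the "nearly true" scores are computed, so the n-gram overlap measure below is only an assumption chosen for illustration, and the names `ngrams`, `overlap_score`, and `pseudo_scores` are hypothetical rather than taken from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap_score(sentence, summary, n=1):
    """Fraction of the sentence's n-grams that also occur in the summary.

    Counts are clipped, so a repeated n-gram is matched at most as many
    times as it appears in the summary. This measure is an assumption,
    not the scoring function used in the paper.
    """
    sent = ngrams(sentence.lower().split(), n)
    summ = ngrams(summary.lower().split(), n)
    total = sum(sent.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, summ[gram]) for gram, count in sent.items())
    return matched / total


def pseudo_scores(sentences, human_summaries, n=1):
    """Average each sentence's overlap score over all human summaries."""
    if not human_summaries:
        raise ValueError("at least one human summary is required")
    return [
        sum(overlap_score(s, h, n) for h in human_summaries) / len(human_summaries)
        for s in sentences
    ]


# Toy document set and human summaries (invented for this example).
sentences = [
    "the storm flooded the coastal town",
    "officials held a press conference",
]
human_summaries = [
    "a storm flooded the town",
    "the coastal town was flooded by the storm",
]
scores = pseudo_scores(sentences, human_summaries)
```

Scores of this kind would serve as regression targets, paired with each sentence's feature vector, when fitting a regressor such as an SVR (for instance `sklearn.svm.SVR` in scikit-learn, if that library is available).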