Journal: ACM Transactions on Information Systems

Exploiting Neighborhood Knowledge for Single Document Summarization and Keyphrase Extraction


Abstract

Document summarization and keyphrase extraction are two related tasks in the IR and NLP fields; both aim to extract a condensed representation from a single text document. Existing methods for single document summarization and keyphrase extraction usually use only the information contained in the specified document. This article proposes using a small number of nearest neighbor documents to improve document summarization and keyphrase extraction for the specified document, under the assumption that the neighbor documents can provide additional knowledge and more clues. The specified document is expanded to a small document set by adding a few neighbor documents close to it, and a graph-based ranking algorithm is then applied to the expanded document set to make use of both the local information in the specified document and the global information in the neighbor documents. Experimental results on the Document Understanding Conference (DUC) benchmark datasets demonstrate the effectiveness and robustness of the proposed approaches. The cross-document sentence relationships in the expanded document set are shown to benefit single document summarization, and the word co-occurrence relationships in the neighbor documents are shown to be very helpful for single document keyphrase extraction.
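
The expand-then-rank idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the edge weighting, or the ranking parameters, so the bag-of-words cosine similarity, the PageRank-style update, the damping factor of 0.85, and all function names below (`nearest_neighbors`, `rank_sentences`, and so on) are assumptions chosen for illustration only.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase alphabetic tokens; a stand-in for real preprocessing."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def split_sentences(doc):
    """Naive sentence splitter on terminal punctuation (assumption)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]


def nearest_neighbors(target, corpus, k):
    """Return the k corpus documents most similar to the target document."""
    tv = Counter(tokenize(target))
    ranked = sorted(
        corpus, key=lambda d: cosine(tv, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


def rank_sentences(target, corpus, k=2, damping=0.85, iterations=50):
    """Rank the target's sentences using a graph built over the expanded set.

    Hypothetical sketch: every sentence of the target and its k neighbors is a
    node, edges are weighted by cosine similarity, and scores come from a
    PageRank-style iteration. Only the target's sentences are returned.
    """
    docs = [target] + nearest_neighbors(target, corpus, k)
    sentences = [(i, s) for i, d in enumerate(docs) for s in split_sentences(d)]
    vectors = [Counter(tokenize(s)) for _, s in sentences]
    n = len(sentences)
    weights = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sums = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        # Isolated sentences pass on no score; their mass is simply dropped.
        scores = [
            (1 - damping) / n
            + damping
            * sum(
                scores[j] * weights[j][i] / out_sums[j]
                for j in range(n)
                if out_sums[j]
            )
            for i in range(n)
        ]
    target_scores = [
        (scores[i], s) for i, (doc_id, s) in enumerate(sentences) if doc_id == 0
    ]
    return sorted(target_scores, reverse=True)


if __name__ == "__main__":
    target = (
        "Graph ranking scores sentences. Cats sleep all day. "
        "Ranking sentences uses a graph."
    )
    corpus = [
        "Sentences are ranked with a graph ranking method.",
        "Stock prices fell sharply on Monday.",
        "A graph connects sentences for ranking.",
    ]
    for score, sentence in rank_sentences(target, corpus, k=2):
        print(f"{score:.4f}  {sentence}")
```

In this sketch the neighbor documents contribute only through the edges their sentences add to the graph, which is one plausible reading of "global information"; a keyphrase variant would build the graph over words linked by co-occurrence rather than over sentences.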
