Journal: Information Processing & Management

FoDoSu: Multi-document summarization exploiting semantic analysis based on social Folksonomy


Abstract

Multi-document summarization techniques aim to reduce documents into a small set of words or paragraphs that convey the main meaning of the original document. Many approaches to multi-document summarization have used probability-based methods and machine learning techniques to simultaneously summarize multiple documents sharing a common topic. However, these techniques fail to semantically analyze proper nouns and newly-coined words because most depend on an out-of-date dictionary or thesaurus. To overcome these drawbacks, we propose a novel multi-document summarization system called FoDoSu, or Folksonomy-based Multi-Document Summarization, that employs the tag clusters used by Flickr, a Folksonomy system, for detecting key sentences from multiple documents. We first create a word frequency table for analyzing the semantics and contributions of words using the HITS algorithm. Then, by exploiting tag clusters, we analyze the semantic relationships between words in the word frequency table. Finally, we create a summary of multiple documents by analyzing the importance of each word and its semantic relatedness to others. Experimental results from the TAC 2008 and 2009 data sets demonstrate the improvement of our proposed framework over existing summarization systems.
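
The three steps named in the abstract (word frequency table, HITS scoring, tag-cluster relatedness) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the HITS graph is built or how scores are combined, so the sentence-word bipartite graph, the `related` helper, the bonus formula in `summarize`, and the hand-written `tag_clusters` (standing in for clusters fetched from Flickr) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", sentence.lower())


def word_frequency_table(sentences):
    """Count how often each word occurs across all sentences."""
    return Counter(w for s in sentences for w in tokenize(s))


def hits_word_scores(sentences, iterations=50):
    """Run HITS on a sentence-word bipartite graph.

    Assumption: sentences act as hubs and words as authorities.
    The paper's actual graph construction may differ.
    """
    tokens = [set(tokenize(s)) for s in sentences]
    words = sorted(set().union(*tokens))
    hub = [1.0] * len(sentences)
    auth = {w: 1.0 for w in words}
    for _ in range(iterations):
        # Authority of a word: sum of hub scores of sentences containing it.
        auth = {w: sum(hub[i] for i, t in enumerate(tokens) if w in t)
                for w in words}
        norm = math.sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {w: v / norm for w, v in auth.items()}
        # Hub of a sentence: sum of authority scores of its words.
        hub = [sum(auth[w] for w in t) for t in tokens]
        norm = math.sqrt(sum(v * v for v in hub)) or 1.0
        hub = [v / norm for v in hub]
    return auth


def related(word_a, word_b, tag_clusters):
    """Hypothetical relatedness test: both words share a tag cluster."""
    return any(word_a in c and word_b in c for c in tag_clusters)


def summarize(sentences, tag_clusters, k=1):
    """Pick the k highest-scoring sentences, kept in original order.

    Assumed scoring: sum of word authorities, plus a bonus for each
    pair of words in the sentence that share a tag cluster.
    """
    auth = hits_word_scores(sentences)
    scores = []
    for s in sentences:
        t = sorted(set(tokenize(s)))
        base = sum(auth[w] for w in t)
        bonus = sum(min(auth[a], auth[b])
                    for i, a in enumerate(t) for b in t[i + 1:]
                    if related(a, b, tag_clusters))
        scores.append(base + bonus)
    top = sorted(range(len(sentences)), key=lambda i: -scores[i])[:k]
    return [sentences[i] for i in sorted(top)]


# Toy input; the tag cluster below is made up, not real Flickr data.
docs = [
    "Flickr users tag photos with clusters of tags.",
    "The weather was pleasant.",
    "Tag clusters on Flickr group related tags.",
]
tag_clusters = [{"flickr", "tag", "tags", "clusters"}]
summary = summarize(docs, tag_clusters, k=1)
print(summary)
```

In this toy input the off-topic weather sentence shares no words with the other two, so HITS drives its score toward zero and it is not selected. A dictionary-free signal such as tag co-occurrence is what would let a system of this kind relate proper nouns like "Flickr" to other terms.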
