Journal: Computing and Informatics

SemPCA-Summarizer: Exploiting Semantic Principal Component Analysis for Automatic Summary Generation


Abstract

Text summarization is the task of condensing a document while keeping its relevant information. Integrated into wider information systems, it can help users access key information without having to read everything, which improves efficiency. In this work, we developed and evaluated a single-document extractive summarization approach, named SemPCA-Summarizer, which reduces the dimensionality of a document using Principal Component Analysis (PCA) enriched with semantic information. A concept-sentence matrix is built from the input document, and PCA is then used to identify and rank the relevant concepts. These concepts drive the selection of the most important sentences through different heuristics, leading to various types of summaries. The results show that the generated summaries are very competitive from both a quantitative and a qualitative viewpoint, indicating that the proposed approach is suitable for providing key information concisely and thus helps readers cope with large amounts of available information more quickly and efficiently.
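
The pipeline outlined in the abstract (concept-sentence matrix, PCA to rank concepts, sentence selection) can be illustrated with a minimal Python sketch. This is not the authors' implementation, and several parts are assumptions made only for illustration: a "concept" is approximated by a lowercase word of four or more letters (the paper enriches concepts with semantic information, which is not reproduced here), concepts are weighted by their absolute loading on the first principal component, and a single selection heuristic sums those weights per sentence. The function names `split_sentences`, `concept_sentence_matrix`, `rank_concepts`, and `summarize` are hypothetical.

```python
import re
import numpy as np


def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def concept_sentence_matrix(sentences):
    # Assumption: a "concept" is any lowercase word of 4+ letters.
    tokenized = [re.findall(r"[a-z]{4,}", s.lower()) for s in sentences]
    concepts = sorted({tok for toks in tokenized for tok in toks})
    index = {c: i for i, c in enumerate(concepts)}
    # Rows are concepts, columns are sentences, entries are counts.
    matrix = np.zeros((len(concepts), len(sentences)))
    for j, toks in enumerate(tokenized):
        for tok in toks:
            matrix[index[tok], j] += 1.0
    return concepts, matrix


def rank_concepts(matrix):
    # PCA via SVD: concepts are the variables, sentences the observations.
    centered = matrix - matrix.mean(axis=1, keepdims=True)
    u, _, _ = np.linalg.svd(centered, full_matrices=False)
    # Assumption: concept weight = absolute loading on the first component.
    return np.abs(u[:, 0])


def summarize(text, n_sentences=2):
    sentences = split_sentences(text)
    concepts, matrix = concept_sentence_matrix(sentences)
    if not concepts:
        return sentences[:n_sentences]
    weights = rank_concepts(matrix)
    # One illustrative heuristic: score = sum of concept weights x counts.
    scores = weights @ matrix
    top = sorted(np.argsort(-scores, kind="stable")[:n_sentences])
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in top]
```

Calling `summarize(document_text, n_sentences=3)` returns three sentences in their original order. Swapping the scoring line for another rule (for example, using several principal components) would mimic the idea of different heuristics producing different summary types, though the specific heuristics used in the paper are not given in this abstract.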
