Generating coherent extracts of single documents using latent semantic analysis.

Abstract

A major problem with automatically produced summaries in general, and extracts in particular, is that the output text often lacks textual coherence. Our goal is to improve the textual coherence of automatically produced extracts. We developed and implemented an algorithm that builds an initial extract composed solely of topic sentences, and then recursively fills in the lacunae by providing linking material from the original text between semantically dissimilar sentences. Our summarizer differs in architecture from most others in that it measures semantic similarity with latent semantic analysis (LSA), a factor analysis technique based on the vector-space model of information retrieval. We believed that the deep semantic relations discovered by LSA would assist in the identification and correction of abrupt topic shifts in the summaries. However, our experiments did not show a statistically significant difference in the coherence of summaries produced by our system as compared with a non-LSA version.
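The abstract describes the approach only at a high level, so the following Python sketch is one possible reading of it, not the thesis's actual implementation. It assumes that topic sentences are supplied as indices by the caller, that sentences are compared by cosine similarity in a truncated-SVD (LSA) space built from raw term counts, and that a gap is bridged by the intermediate sentence whose weaker similarity to either endpoint is highest. The function names, the rank `k`, and the `threshold` value are all hypothetical choices made for illustration.

```python
import re
import numpy as np


def lsa_sentence_vectors(sentences, k=2):
    """Project sentences into a k-dimensional LSA space via truncated SVD."""
    tokenized = [re.findall(r"[a-z]+", s.lower()) for s in sentences]
    vocab = sorted({w for toks in tokenized for w in toks})
    index = {w: i for i, w in enumerate(vocab)}
    # Term-by-sentence count matrix (rows: terms, columns: sentences).
    A = np.zeros((len(vocab), len(sentences)))
    for j, toks in enumerate(tokenized):
        for w in toks:
            A[index[w], j] += 1.0
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    k = min(k, len(s))
    # One row per sentence: its coordinates scaled by the singular values.
    return Vt[:k].T * s[:k]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is zero."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def fill_gaps(vecs, left, right, threshold):
    """Return indices strictly between left and right chosen as linking material."""
    if right - left <= 1 or cosine(vecs[left], vecs[right]) >= threshold:
        return []  # Adjacent in the source, or already similar enough.
    # Assumed bridging rule: maximize the weaker of the two endpoint similarities.
    best = max(
        range(left + 1, right),
        key=lambda m: min(cosine(vecs[left], vecs[m]), cosine(vecs[m], vecs[right])),
    )
    # Recurse on both halves; each interval is strictly smaller, so this terminates.
    return (fill_gaps(vecs, left, best, threshold)
            + [best]
            + fill_gaps(vecs, best, right, threshold))


def coherent_extract(sentences, topic_idx, k=2, threshold=0.5):
    """Return sorted sentence indices: topic sentences plus linking sentences."""
    topic_idx = sorted(set(topic_idx))
    if not topic_idx:
        return []
    vecs = lsa_sentence_vectors(sentences, k)
    chosen = [topic_idx[0]]
    for a, b in zip(topic_idx, topic_idx[1:]):
        chosen += fill_gaps(vecs, a, b, threshold) + [b]
    return chosen


if __name__ == "__main__":
    doc = [
        "Summaries produced by extraction often lack coherence.",
        "Coherence depends on how adjacent sentences relate.",
        "Latent semantic analysis compares sentences in a reduced space.",
        "The reduced space comes from a singular value decomposition.",
        "Extraction with linking sentences may yield smoother summaries.",
    ]
    for i in coherent_extract(doc, topic_idx=[0, 4]):
        print(i, doc[i])
```

Raising `threshold` makes the sketch pull in more linking sentences, and lowering it keeps the extract closer to the bare topic sentences. A non-LSA baseline of the kind the abstract mentions could be approximated by comparing the raw term-count columns instead of the reduced vectors.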
