Topic structure extraction apparatus, topic structure extraction program, and computer-readable storage medium storing topic structure extraction program

Abstract

PROBLEM TO BE SOLVED: To detect a plurality of topics in a text, and to automatically generate the relations between topics and a summary sentence for each topic.

SOLUTION: The text is divided into word units, and a concept base is searched to obtain the vector corresponding to each word. From the resulting series of word vectors, the text is divided into segments, each a block of text on the same topic. Each segment is treated as the set of word vectors it contains, and the segments are hierarchically clustered on the criterion that segments close to each other belong to the same cluster. For each resulting cluster, a summary sentence characterizing the cluster is extracted from the text it contains, and the inter-cluster relations and the summary sentence of each cluster are output.

COPYRIGHT: (C)2005,JPO&NCIPI
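The steps in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the patent: the toy `CONCEPT_BASE` and its values, whitespace tokenization, cosine similarity as the closeness measure, the `threshold` of 0.8, segmenting at sentence boundaries, centroid-based merging, and picking the sentence nearest the cluster centroid as the summary. All function names are hypothetical.

```python
import math

# Hypothetical toy concept base: word -> concept vector (values invented).
CONCEPT_BASE = {
    "cat": (1.0, 0.0), "dog": (0.9, 0.1), "pet": (0.8, 0.2),
    "stock": (0.0, 1.0), "market": (0.1, 0.9), "price": (0.2, 0.8),
}

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return tuple(sum(col) / n for col in zip(*vectors))

def sentence_vector(sentence):
    """Split into words, look each up in the concept base, average the hits."""
    vecs = [CONCEPT_BASE[w] for w in sentence.lower().split() if w in CONCEPT_BASE]
    return mean(vecs) if vecs else (0.0, 0.0)  # no known word: zero vector

def segment(sentences, threshold=0.8):
    """Start a new segment whenever adjacent sentences are not similar enough."""
    if not sentences:
        return []
    segments = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if cosine(sentence_vector(prev), sentence_vector(cur)) >= threshold:
            segments[-1].append(cur)
        else:
            segments.append([cur])
    return segments

def cluster(segments, threshold=0.8):
    """Agglomerative clustering of segments by centroid similarity.

    Returns (clusters, merges): clusters are lists of segment indices,
    merges records each merge step as (cluster_a, cluster_b, similarity).
    """
    seg_vecs = [mean([sentence_vector(s) for s in seg]) for seg in segments]
    centroid = lambda c: mean([seg_vecs[i] for i in c])
    clusters = [[i] for i in range(len(segments))]
    merges = []
    while len(clusters) > 1:
        # Find the most similar pair of clusters.
        sim, ia, ib = max(
            (cosine(centroid(a), centroid(b)), ia, ib)
            for ia, a in enumerate(clusters)
            for ib, b in enumerate(clusters) if ia < ib
        )
        if sim < threshold:
            break
        merges.append((clusters[ia], clusters[ib], sim))
        merged = clusters[ia] + clusters[ib]
        clusters = [c for k, c in enumerate(clusters) if k not in (ia, ib)]
        clusters.append(merged)
    return clusters, merges

def summarize(segments, cluster_ids):
    """Pick the sentence closest to the centroid of the cluster's sentences."""
    sents = [s for i in cluster_ids for s in segments[i]]
    center = mean([sentence_vector(s) for s in sents])
    return max(sents, key=lambda s: cosine(sentence_vector(s), center))

if __name__ == "__main__":
    text = ["cat dog", "pet dog", "stock market", "market price", "cat pet"]
    segs = segment(text)
    clusters, merges = cluster(segs)
    for c in clusters:
        print(sorted(c), "->", summarize(segs, c))
```

In this sketch the `merges` list stands in for the inter-cluster relations, since it records which clusters were joined and at what similarity. A real concept base would supply far higher-dimensional vectors, and the segmentation rule would likely operate on the word-vector series directly rather than on whole sentences.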
