SNIPPET GENERATION DEVICE, SNIPPET GENERATION METHOD AND SNIPPET GENERATION PROGRAM

Abstract

PROBLEM TO BE SOLVED: To properly generate a snippet from a structured document on the basis of a retrieval query.

SOLUTION: A snippet generation device 1 is configured as follows. A DOM tree construction unit 11 parses a structured document, expands the nodes that make up the document into a tree structure, and extracts the title and content nodes of the document from that structure. A cluster generation unit 12 clusters the nodes of the tree structure based on the similarity between nodes. A score application unit 13 assigns a score to each cluster generated by the clustering, based on the words of the retrieval query, their related words, and the unique expressions of those words that are contained in the cluster. A snippet generation unit 14 selects the top-scoring clusters as candidate elements of the snippet, such that the length of the generated snippet is equal to or less than a threshold, and rearranges the selected clusters in their order of appearance in the structured document to generate the snippet.

SELECTED DRAWING: Figure 1

COPYRIGHT: (C) 2016, JPO & INPIT
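
The scoring and selection steps performed by units 13 and 14 can be sketched roughly as follows. This is only an illustrative sketch, not the patented implementation: the `Cluster` type, the function names, the term weights, the `" ... "` separator, and the choice to skip a cluster that does not fit (rather than stop) are all assumptions, and DOM parsing and clustering (units 11 and 12) are taken as already done.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    position: int  # order of appearance in the document (hypothetical field)
    text: str      # concatenated text of the nodes in the cluster


def score_cluster(cluster, query_words, related_words, unique_expressions):
    """Score a cluster by counting term occurrences; weights are assumed values."""
    text = cluster.text.lower()
    weighted_terms = (
        (query_words, 2.0),
        (related_words, 1.0),
        (unique_expressions, 1.5),
    )
    score = 0.0
    for terms, weight in weighted_terms:
        score += weight * sum(text.count(term.lower()) for term in terms)
    return score


def generate_snippet(clusters, query_words, related_words=(),
                     unique_expressions=(), max_length=100, separator=" ... "):
    """Pick top-scoring clusters within max_length, then restore document order."""
    scored = [
        (score_cluster(c, query_words, related_words, unique_expressions), c)
        for c in clusters
    ]
    # Highest score first; ties keep their original relative order.
    scored.sort(key=lambda pair: pair[0], reverse=True)

    selected = []
    length = 0
    for score, cluster in scored:
        if score <= 0:
            continue  # cluster contains none of the scoring terms
        extra = len(cluster.text) + (len(separator) if selected else 0)
        if length + extra <= max_length:
            selected.append(cluster)
            length += extra

    # Rearrange the chosen clusters in their order of appearance.
    selected.sort(key=lambda c: c.position)
    return separator.join(c.text for c in selected)


if __name__ == "__main__":
    demo = [
        Cluster(0, "Snippet generation overview"),
        Cluster(1, "Unrelated navigation links"),
        Cluster(2, "A snippet is built from a query; the query terms drive scoring"),
    ]
    print(generate_snippet(demo, ["snippet", "query"]))
```

Selecting greedily by score and sorting by position only afterwards keeps the two concerns in the abstract separate: relevance decides which clusters appear, and document order decides how they are arranged.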
