Keyword selection method that calculates the significance of terms in n documents related to a specific scientific field
PROBLEM TO BE SOLVED: To precisely identify the keywords that characterize each document, so that the contents of each document can be grasped at a glance, in a document database that collects a plurality of documents related to a specific field.
SOLUTION: This keyword extraction method for a document database causes a programmed computer to execute: a step of acquiring the total number $m$ of terms contained in a document database that collects $n$ documents related to a specific field, together with the individual terms $T_j$ ($j = 1, 2, 3, \ldots, m$), and managing the identification of each term $T_j$; a step of calculating, by a predetermined formula, an appearance-frequency value $W_{ij}$ for the term $T_j$ in a document $D_i$; a step of calculating the variance $S_j^2$ of the $W_{ij}$ values for the term $T_j$; a step of calculating the significance $V_{ij}$ of the term $T_j$ in the document $D_i$ as $V_{ij} = U_{ij} \times S_j^2$, where $U_{ij}$ is the appearance frequency of the term $T_j$ in the document $D_i$; and a step of preparing and outputting a term list in which the terms $T_j$ are listed on the basis of $V_{ij}$.
COPYRIGHT: (C)2006,JPO&NCIPI
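The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not give the "predetermined calculation formula" for W_ij, so the sketch assumes W_ij is the relative frequency of a term within a document, takes U_ij as the raw count, and reads the spread measure S²_j as the population variance over the n documents. The function name `keyword_lists`, the parameter `top_k`, and the sample data are hypothetical.

```python
from collections import Counter


def keyword_lists(documents, top_k=3):
    """Rank the terms of each document D_i by V_ij = U_ij * S2_j.

    documents: list of token lists, one list per document D_i.
    Returns one list of up to top_k terms per document.
    """
    n = len(documents)
    counts = [Counter(doc) for doc in documents]   # U_ij: raw counts
    terms = sorted(set().union(*documents))        # T_1 .. T_m

    # W_ij: ASSUMED to be relative frequency (the abstract only says
    # "a predetermined calculation formula").
    w = [
        {t: (c[t] / len(doc) if doc else 0.0) for t in terms}
        for c, doc in zip(counts, documents)
    ]

    # S2_j: population variance of W_ij across the n documents (assumed).
    s2 = {}
    for t in terms:
        mean = sum(w_i[t] for w_i in w) / n
        s2[t] = sum((w_i[t] - mean) ** 2 for w_i in w) / n

    result = []
    for c in counts:
        v = {t: c[t] * s2[t] for t in c}           # V_ij = U_ij * S2_j
        # Highest significance first; ties broken alphabetically.
        ranked = sorted(v, key=lambda t: (-v[t], t))
        result.append(ranked[:top_k])
    return result


# Hypothetical sample data: three tiny "documents" from one field.
docs = [
    ["gene", "protein", "gene", "cell"],
    ["cell", "protein", "membrane", "membrane"],
    ["cell", "protein", "enzyme", "enzyme"],
]
print(keyword_lists(docs, top_k=1))
```

Under these assumptions, terms such as "cell" and "protein" that occur at the same rate in every document have zero variance and therefore zero significance, while terms concentrated in one document rise to the top of that document's list.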