EXTREMELY SIMILAR DOCUMENT EXTRACTION METHOD
Abstract
PROBLEM TO BE SOLVED: To accurately extract documents that are extremely similar to a given document, with little noise.

SOLUTION: Document input processing 2 is performed on a new document 1, followed by word appearance pattern extraction processing 3, which uses dictionaries 11 and 12 to extract words of specified parts of speech, eliminate unnecessary words, and identify the order in which words appear. Extremely similar document decision processing 4 is then performed. A word information table 13 is generated and collated with a DB information table 14, which is obtained by applying processing 3 to all documents in the DB. For each document, the words appearing in common are extracted, together with the strings of words whose appearance order is the same. A weighted count of the common words is added to the value of a monotonically increasing function of the number of words in such a string, giving a degree of extreme similarity for each sentence unit. A document is judged extremely similar when sentence units whose degree of extreme similarity exceeds a certain threshold continue for more than a certain length. The result is displayed 5 and registration judgement 6 is performed.

COPYRIGHT: (C)1997, JPO
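The scoring and decision steps of processing 4 can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not specify the weight, the monotonically increasing function, the threshold, or how sentence units are paired, so all of those are assumptions here. In particular, the sketch assumes the "string of words in the same appearance order" is the longest common subsequence, uses n squared as the increasing function, scores each new sentence against its best-matching DB sentence, and treats a run of at least `min_run` consecutive sentences as "continuing for a certain length". All function and parameter names are hypothetical.

```python
from typing import Callable, Sequence

Sentence = Sequence[str]  # one sentence unit: extracted words in appearance order


def lcs_length(a: Sentence, b: Sentence) -> int:
    """Length of the longest word sequence appearing in the same order in a and b.

    Assumption: the abstract's "string of words" is read as a longest
    common subsequence; a contiguous run would be another valid reading.
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            curr.append(prev[j - 1] + 1 if x == y else max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def sentence_score(
    a: Sentence,
    b: Sentence,
    weight: float = 1.0,                                      # assumed weight
    growth: Callable[[int], float] = lambda n: float(n * n),  # assumed function
) -> float:
    """Degree of extreme similarity between two sentence units."""
    common = len(set(a) & set(b))  # number of words appearing in common
    return weight * common + growth(lcs_length(a, b))


def is_extremely_similar(
    new_doc: Sequence[Sentence],
    db_doc: Sequence[Sentence],
    threshold: float,
    min_run: int,
    weight: float = 1.0,
) -> bool:
    """True if at least min_run consecutive sentences of new_doc score above threshold."""
    run = 0
    for s in new_doc:
        # Assumption: each new sentence is scored against its best DB match.
        best = max((sentence_score(s, t, weight) for t in db_doc), default=0.0)
        run = run + 1 if best > threshold else 0
        if run >= min_run:
            return True
    return False
```

With the default assumptions, `sentence_score(["cat", "sat", "mat"], ["cat", "sat", "hat"])` gives 6.0: two common words plus the square of an in-order string of length two. Because the growth function is superlinear, long in-order matches dominate the score, which is one plausible way to keep documents that merely share vocabulary from registering as extremely similar.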