PROBLEM TO BE SOLVED: To solve the problems that, even when the anchor character string at the link origin of a document is used as the object of retrieval/classification, the anchor character string does not necessarily describe the contents of the document completely, and further that narrowing-down retrieval cannot be performed with sufficient accuracy.;SOLUTION: A document cluster information acquiring means 12 extracts link information from a given document, generates a document reference relation table, then determines whether the given document starts from the top page, and registers the document in a document cluster table according to the result of the determination. A document keyword determining means 14 refers to the document reference relation table and the document cluster table and, for the documents in each cluster, sets the anchor character strings of links coming from outside the site as site-outside keywords, sets the series of anchor character strings obtained by tracing back the links to the document within the same cluster as site-inside keywords, and stores them respectively in a document keyword storage part 22.;COPYRIGHT: (C)2004,JPO
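The keyword determination described in the SOLUTION can be illustrated with a minimal Python sketch. It is not the patented implementation, and several points are assumptions made only for illustration: a cluster is taken to be one URL host, a top page is a URL whose path is empty or "/", and the in-cluster trace follows the first recorded link at each step. The names LINKS, cluster_of, is_top_page, build_cluster_table and determine_keywords are hypothetical, as are the example URLs.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Toy document reference relation table: (link origin, link target, anchor string).
LINKS = [
    ("http://portal.example/", "http://shop.example/", "Online shop"),
    ("http://shop.example/", "http://shop.example/books/", "Books"),
    ("http://shop.example/books/", "http://shop.example/books/python.html", "Python guide"),
    ("http://blog.example/post1", "http://shop.example/books/python.html", "great Python book"),
]


def cluster_of(url):
    """Assumption: all documents on the same host form one cluster (site)."""
    return urlparse(url).netloc


def is_top_page(url):
    """Assumption: a top page is a URL whose path is empty or '/'."""
    return urlparse(url).path in ("", "/")


def build_cluster_table(links):
    """Register every document under its cluster and record the top page, if any."""
    table = {}
    for src, dst, _anchor in links:
        for url in (src, dst):
            entry = table.setdefault(cluster_of(url), {"top": None, "members": set()})
            entry["members"].add(url)
            if is_top_page(url):
                entry["top"] = url
    return table


def determine_keywords(links):
    """Return {url: {"outside": [...], "inside": [...]}} for every document seen."""
    outside = defaultdict(list)   # anchors of links arriving from another cluster
    parents = defaultdict(list)   # in-cluster links: target -> [(origin, anchor)]
    documents = {}                # insertion-ordered set of all URLs
    for src, dst, anchor in links:
        documents[src] = None
        documents[dst] = None
        if cluster_of(src) != cluster_of(dst):
            outside[dst].append(anchor)
        else:
            parents[dst].append((src, anchor))

    def inside_chain(url):
        # Trace in-cluster links back toward the top page, collecting anchors.
        chain, seen = [], {url}
        while parents.get(url):
            src, anchor = parents[url][0]  # assumption: follow the first recorded link
            if src in seen:                # guard against link cycles
                break
            chain.append(anchor)
            seen.add(src)
            url = src
        return chain[::-1]                 # order from the top page down to the document

    return {
        url: {"outside": list(outside.get(url, [])), "inside": inside_chain(url)}
        for url in documents
    }
```

With this toy table, the document at http://shop.example/books/python.html gets the anchor from the blog link as its site-outside keyword and the anchors collected along the in-cluster path from the shop's top page as its site-inside keywords. Storing the two kinds separately mirrors the separate storage in the document keyword storage part; how a real system would pick among several in-cluster paths is left open here.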