首页>
外国专利>
Category based, extensible and interactive system for document retrieval
Category based, extensible and interactive system for document retrieval
展开▼
机译:基于类别的可扩展交互式文档检索系统
展开▼
页面导航
摘要
著录项
相似文献
摘要
In information retrieval (IR) systems with high-speed access, especially to search engines applied to the Internet and/or corporate intranet domains for retrieving accessible documents automatic text categorization techniques are used to support the presentation of search query results within high-speed network environments. ;An integrated, automatic and open information retrieval system (100) comprises an hybrid method based on linguistic and mathematical approaches for an automatic text categorization. It solves the problems of conventional systems by combining an automatic content recognition technique with a self-learning hierarchical scheme of indexed categories. In response to a word submitted by a requester, said system (100) retrieves documents containing that word, analyzes the documents to determine their word-pair patterns, matches the document patterns to database patterns that are related to topics, and thereby assigns topics to each document. If the retrieved documents are assigned to more than one topic, a list of the document topics is presented to the requester, and the requester designates the relevant topics. The requester is then granted access only to documents assigned to relevant topics. A knowledge database (1408) linking search terms to documents and documents to topics is established and maintained to speed future searches. Additionally, new strategies are presented to deal with different update frequencies of changed Web sites.
展开▼