INFORMATION RETRIEVAL AND TEXT MINING USING DISTRIBUTED LATENT SEMANTIC INDEXING
Abstract
The use of latent semantic indexing (LSI) for information retrieval and text mining operations is adapted to work on large heterogeneous data sets by first partitioning the data set into a number of smaller partitions having similar concept domains. A similarity graph network is generated in order to expose links between concept domains, which are then exploited in determining which domains to query as well as in expanding the query vector. LSI is performed on those partitioned data sets most likely to contain information related to the user query or text mining operation. In this manner, LSI can be applied to data sets that heretofore presented scalability problems. Additionally, the computation of the singular value decomposition of the term-by-document matrix can be accomplished at various distributed computers, increasing the robustness of the retrieval and text mining system while decreasing search times.
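The flow described in the abstract can be illustrated with a small sketch. It is not the patented method: the toy vocabulary, the two partitions, and the helper names `fit_lsi`, `select_partitions`, and `query_partition` are all hypothetical, and partition selection uses a plain cosine against each partition's centroid as a stand-in for the similarity graph network, which the abstract does not specify in detail. Each partition's SVD is independent of the others, so in a distributed setting each call to `fit_lsi` could run on a separate machine.

```python
import numpy as np

def fit_lsi(term_doc, k, tol=1e-10):
    """Rank-k truncated SVD of a term-by-document matrix (rows = terms)."""
    U, s, Vt = np.linalg.svd(term_doc, full_matrices=False)
    r = min(k, int(np.sum(s > tol)))  # skip (near-)zero singular values
    return U[:, :r], s[:r], Vt[:r, :]

def cosine(a, b):
    den = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / den) if den > 0 else 0.0

def select_partitions(partitions, query, n):
    """Assumed stand-in for the similarity graph: rank partitions by the
    cosine between the query and the sum of the partition's documents."""
    scores = {name: cosine(m.sum(axis=1), query) for name, m in partitions.items()}
    return sorted(scores, key=scores.get, reverse=True)[:n]

def query_partition(model, query):
    """Fold the query into the partition's LSI space, score each document."""
    U, s, Vt = model
    q_hat = (query @ U) / s               # q^T U_k S_k^{-1}
    return [cosine(doc, q_hat) for doc in Vt.T]

# Toy vocabulary (hypothetical): svd, matrix, index, gene, protein, cell
partitions = {
    "math": np.array([[1, 1, 0], [1, 1, 0], [0, 0, 1],
                      [0, 0, 0], [0, 0, 0], [0, 0, 0]], dtype=float),
    "biology": np.array([[0, 0, 0], [0, 0, 0], [0, 0, 1],
                         [1, 1, 0], [1, 1, 0], [0, 0, 1]], dtype=float),
}

# One SVD per partition; these are independent and could be farmed out.
models = {name: fit_lsi(m, k=2) for name, m in partitions.items()}

query = np.array([1, 1, 0, 0, 0, 0], dtype=float)  # terms "svd" and "matrix"
selected = select_partitions(partitions, query, n=1)
results = {name: query_partition(models[name], query) for name in selected}
```

Only the partitions returned by `select_partitions` are queried, so the cost of a search scales with the size of the relevant concept domains rather than with the whole collection. Query expansion across linked domains, which the abstract also mentions, is left out of this sketch.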