Proceedings of the 1995 ACM/IEEE supercomputing conference

Computational Methods for Intelligent Information Access


Abstract

Currently, most approaches to retrieving textual materials from scientific databases depend on a lexical match between words in users' requests and those in or assigned to documents in a database. Because of the tremendous diversity in the words people use to describe the same document, lexical methods are necessarily incomplete and imprecise. Using the singular value decomposition (SVD), one can take advantage of the implicit higher-order structure in the association of terms with documents by determining the SVD of large sparse term-by-document matrices. Terms and documents represented by 200-300 of the largest singular vectors are then matched against user queries. We call this retrieval method Latent Semantic Indexing (LSI) because the subspace represents important associative relationships between terms and documents that are not evident in individual documents. LSI is a completely automatic yet intelligent indexing method, widely applicable, and a promising way to improve users' access to many kinds of textual materials, or to documents and services for which textual descriptions are available. A survey of the computational requirements for managing LSI-encoded databases as well as current and future applications of LSI is presented.
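
The pipeline the abstract describes (build a term-by-document matrix, keep the largest singular triplets, match queries in the reduced space) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the five-document toy corpus, raw term counts as weights, and k = 2 are all assumptions made for illustration; the abstract itself speaks of 200-300 dimensions and large sparse matrices, for which a sparse solver such as `scipy.sparse.linalg.svds` would be a more realistic choice than the dense `numpy.linalg.svd` used here. The query is folded in as q^T U_k S_k^{-1} and compared to document vectors by cosine similarity, which is one common convention.

```python
import numpy as np

def build_term_document_matrix(docs):
    """Return (vocab, A) where A[i, j] is the count of term i in document j."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return vocab, A

def lsi_fit(A, k):
    """Truncated SVD: keep the k largest singular triplets of A."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # Columns of U_k are term directions; rows of V_k are document vectors.
    return U[:, :k], s[:k], Vt[:k, :].T

def fold_in_query(query, vocab, U_k, s_k):
    """Project a query into the k-dimensional space: q_hat = q^T U_k S_k^{-1}."""
    index = {w: i for i, w in enumerate(vocab)}
    q = np.zeros(len(vocab))
    for w in query.lower().split():
        if w in index:  # words outside the vocabulary are ignored
            q[index[w]] += 1.0
    return (q @ U_k) / s_k

def rank_documents(q_hat, V_k):
    """Rank documents by cosine similarity to the folded-in query."""
    norms = np.linalg.norm(V_k, axis=1) * np.linalg.norm(q_hat)
    sims = (V_k @ q_hat) / np.where(norms == 0, 1.0, norms)
    return np.argsort(-sims), sims

# Hypothetical toy corpus: document 0 never uses the word "automobile".
docs = [
    "car engine repair",
    "automobile engine repair",
    "car automobile dealer",
    "bread baking recipe",
    "recipe bread flour",
]
vocab, A = build_term_document_matrix(docs)
U_k, s_k, V_k = lsi_fit(A, k=2)
q_hat = fold_in_query("automobile", vocab, U_k, s_k)
order, sims = rank_documents(q_hat, V_k)
for j in order:
    print(f"{sims[j]:+.3f}  {docs[j]}")
```

In this toy corpus a purely lexical match on "automobile" would miss document 0, while the reduced space places it alongside the other vehicle documents because it shares "engine", "repair", and "car" with them. With real collections the choice of k and of term weighting matters considerably and is not settled by this sketch.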
