International Conference on Measuring Technology and Mechatronics Automation

Application Research on Latent Semantic Analysis for Information Retrieval

Abstract

Classic information retrieval models rest on machine matching of keywords, that is, keyword-based retrieval. This paper proposes a pre-clustering-based latent semantic analysis (LSA) algorithm for document retrieval. The algorithm addresses a costly step in traditional LSA retrieval: computing the similarity between the query vector and every document vector. It first clusters the documents using k-means clustering based on latent semantic analysis and finds the center of each cluster, then computes the similarity between the query vector and each cluster center for retrieval. To suit the characteristics of document retrieval, it also proposes a new method for calculating feature weights and uses pre-clustering to preprocess the document collection. Experimental results show that the new algorithm reduces search time and improves retrieval efficiency.
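
The centroid-first lookup described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation. It assumes the document vectors have already been projected into a latent space (the 2-D values below are made up), it seeds k-means with the first k vectors for determinism, and it assumes that only the best-matching cluster is ranked after the centroid comparison, a detail the abstract does not specify. The paper's feature-weighting method is not described in the abstract and is not modeled. All function names are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def kmeans(vectors, k, iters=20):
    """Plain Lloyd's k-means, seeded with the first k vectors (assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign every vector to its nearest centroid (squared Euclidean).
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])))
            for v in vectors
        ]
        # Recompute each centroid; keep the old one if a cluster is empty.
        updated = []
        for c in range(k):
            members = [v for v, l in zip(vectors, labels) if l == c]
            updated.append(mean(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

def search(query, vectors, centroids, labels):
    """Compare the query with the k centroids, then rank one cluster only."""
    best = max(range(len(centroids)), key=lambda c: cosine(query, centroids[c]))
    candidates = [i for i, l in enumerate(labels) if l == best]
    return sorted(candidates, key=lambda i: cosine(query, vectors[i]), reverse=True)

# Hypothetical latent-space document vectors (illustrative values only).
doc_vectors = [
    [0.9, 0.1],
    [0.1, 0.9],
    [0.8, 0.2],
    [0.2, 0.8],
]

# Pre-clustering happens once, offline; each query then reuses the result.
centroids, labels = kmeans(doc_vectors, k=2)
ranking = search([1.0, 0.3], doc_vectors, centroids, labels)
print(labels, ranking)
```

With n documents and k clusters, a query in this sketch costs k centroid comparisons plus the comparisons inside one cluster, instead of n comparisons against every document, which is where the reduction in search time would come from.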