International Conference on Measuring Technology and Mechatronics Automation

Application Research on Latent Semantic Analysis for Information Retrieval

Abstract

The basic principle of the classical information retrieval model is machine matching of keywords, that is, keyword-based retrieval. This paper proposes a pre-clustering-based latent semantic analysis algorithm for document retrieval. The algorithm addresses the time-consuming computation of the similarity between the query vector and every text vector in the traditional latent semantic algorithm for document retrieval. It first clusters the documents using k-means clustering based on latent semantic analysis, finds the central point of each cluster, and then calculates the similarity between the query vector and each cluster's central point for retrieval. In view of the characteristics of document retrieval, the paper also proposes a new method for calculating feature weights and adopts pre-clustering to preprocess the document collection. The experimental results show that the new algorithm reduces search time and improves retrieval efficiency.
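The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the proposed feature-weighting formula, so raw term counts are used as a stand-in, and every function name, the toy documents, and the choices of `rank` and `k` are assumptions made for illustration. The sketch projects documents into a truncated-SVD latent space, pre-clusters them with a plain k-means loop, and at query time compares the query only with the cluster centroids before ranking the documents inside the closest cluster.

```python
import numpy as np

def build_term_doc(docs):
    """Term-by-document count matrix (assumed weighting: raw term frequency)."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for w in d.lower().split():
            A[index[w], j] += 1.0
    return A, index

def lsa(A, rank):
    """Truncated SVD; documents are projected onto the top `rank` left singular vectors."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    U_k = U[:, :rank]
    doc_vecs = A.T @ U_k  # one row per document in the latent space
    return U_k, doc_vecs

def kmeans(X, k, iters=50, seed=0):
    """Plain Lloyd's algorithm; an empty cluster keeps its previous centroid."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            members = X[labels == c]
            if len(members):
                centroids[c] = members.mean(axis=0)
    return centroids, labels

def cosine(a, B):
    """Cosine similarity between vector `a` and each row of `B` (0 for zero vectors)."""
    denom = np.linalg.norm(a) * np.linalg.norm(B, axis=1)
    return (B @ a) / np.where(denom == 0, 1.0, denom)

def search(query, index, U_k, doc_vecs, centroids, labels):
    """Pick the closest centroid, then rank only the documents in that cluster."""
    q = np.zeros(U_k.shape[0])
    for w in query.lower().split():
        if w in index:
            q[index[w]] += 1.0
    q_vec = q @ U_k  # project the query into the same latent space
    best = int(cosine(q_vec, centroids).argmax())
    members = np.flatnonzero(labels == best)
    sims = cosine(q_vec, doc_vecs[members])
    return best, members[np.argsort(-sims)].tolist()

# Toy collection (hypothetical data, not from the paper).
docs = [
    "latent semantic analysis of text documents",
    "semantic indexing of text documents with svd",
    "clustering groups similar document vectors",
    "clustering vectors around cluster centroids",
    "motor control for mechatronics automation",
    "automation of motor measuring systems",
]
A, index = build_term_doc(docs)
U_k, doc_vecs = lsa(A, rank=3)
centroids, labels = kmeans(doc_vecs, k=3)
best, hits = search("semantic analysis of documents", index, U_k, doc_vecs, centroids, labels)
```

With `n` documents and `k` clusters, the query is compared with `k` centroids plus the members of one cluster instead of all `n` document vectors, which is the source of the search-time saving the abstract reports. The trade-off in this sketch is that relevant documents assigned to other clusters are not returned.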
