Journal: 情报学报

Clustering Retrieval Results Based on Keyword Co-occurrence Analysis

     

Abstract

As the Internet grows rapidly in scale, improving the effectiveness of information retrieval has become considerably harder. This paper first extracts the keywords of each document with a specific algorithm, then uses statistical methods to count the keywords that co-occur across documents and builds the corresponding co-occurrence keyword label matrix. Finally, it applies a hierarchical clustering algorithm to the co-occurrence keyword labels, producing a hierarchical label tree from which the document clusters are constructed. The method effectively classifies the results returned by the source search engine, letting users view information related to a query at a higher topic level and accurately locate the information they are interested in. A comparison with the Lingo algorithm shows that the labels produced by the proposed algorithm are more readable and more general, and the F-measure evaluation indicates that the algorithm also improves text clustering quality to some extent.
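The three steps named in the abstract (keyword extraction, co-occurrence counting, hierarchical clustering of labels) can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the abstract does not specify the keyword extractor, the similarity measure, or the linkage rule, so everything below is an assumption made for illustration. `extract_keywords` is a hypothetical frequency-based stand-in, similarity is taken to be the Jaccard coefficient of the document sets behind two labels, and `threshold` is an invented stopping parameter.

```python
from collections import Counter
from itertools import combinations

# Assumption: a tiny illustrative stopword list, not taken from the paper.
STOPWORDS = {"the", "a", "an", "of", "and", "for", "in", "to", "is"}


def extract_keywords(text, top_k=3):
    """Hypothetical stand-in extractor: the top_k most frequent non-stopword tokens."""
    tokens = [t for t in text.lower().split() if t not in STOPWORDS]
    return {word for word, _ in Counter(tokens).most_common(top_k)}


def keyword_index(doc_keywords):
    """Map each keyword to the set of document ids whose keyword set contains it."""
    index = {}
    for doc_id, keywords in enumerate(doc_keywords):
        for kw in keywords:
            index.setdefault(kw, set()).add(doc_id)
    return index


def jaccard(docs_a, docs_b):
    """Assumed similarity: for two single keywords, len(docs_a & docs_b) is
    their co-occurrence count, normalised by the size of the union."""
    return len(docs_a & docs_b) / len(docs_a | docs_b)


def cluster_labels(doc_keywords, threshold=0.5):
    """Agglomerative clustering of keyword labels.

    Each cluster is a pair (label_tree, doc_ids). The two most similar
    clusters are merged repeatedly until the best similarity drops below
    `threshold`; nested tuples record the merge order as a label tree.
    """
    clusters = sorted(keyword_index(doc_keywords).items())
    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda ij: jaccard(clusters[ij[0]][1], clusters[ij[1]][1]),
        )
        if jaccard(clusters[i][1], clusters[j][1]) < threshold:
            break
        merged = (
            (clusters[i][0], clusters[j][0]),   # subtree of merged labels
            clusters[i][1] | clusters[j][1],    # documents covered by the subtree
        )
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters


if __name__ == "__main__":
    # Keyword sets for four hypothetical search results for the query "python".
    results = [
        {"python", "pandas"},
        {"python", "pandas"},
        {"snake", "reptile"},
        {"snake", "reptile", "python"},
    ]
    for labels, docs in cluster_labels(results):
        print(labels, sorted(docs))
```

In this toy run, results about the programming language and results about the animal end up under separate label subtrees, which is the kind of topic-level grouping the abstract describes. A result can belong to more than one cluster when its keywords fall under different subtrees.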
