Source: International Conference on Advances in Computing

Implementation of Web Search Result Clustering System



Abstract

Web search results clustering is an increasingly popular technique for grouping web search results in a useful way. This paper introduces a prototype web search results clustering engine built on the Modified Furthest Point First (M-FPF) algorithm. The engine uses random sampling with medoids instead of centroids to improve clustering quality, and it labels clusters by combining intra-cluster and inter-cluster term extraction based on a variant of the information gain measure. M-FPF is compared against two established web document clustering algorithms, Suffix Tree Clustering (STC) and Lingo, both provided by the free, open-source Carrot2 Document Clustering Workbench. We measure cluster quality in terms of precision, recall, and relevance. Results from tests on different datasets show considerable clustering quality.
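To make the clustering idea concrete, the following is a minimal sketch of the generic furthest-point-first heuristic applied to a random sample, with every cluster represented by an actual data point rather than a mean. It is not the paper's implementation: the abstract does not describe how M-FPF updates medoids or represents documents, so the function names (`furthest_point_first`, `cluster_with_sampling`), the Euclidean distance, and the `sample_size` parameter are all assumptions made for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def furthest_point_first(points, k, dist=euclidean):
    """Return the indices of k centers picked by the greedy
    furthest-point-first heuristic. Centers are actual data points."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and len(points)")
    centers = [0]  # arbitrary starting center: the first point
    # nearest[i] is the distance from points[i] to its closest center so far
    nearest = [dist(p, points[0]) for p in points]
    while len(centers) < k:
        # the next center is the point furthest from all current centers
        nxt = max(range(len(points)), key=nearest.__getitem__)
        centers.append(nxt)
        for i, p in enumerate(points):
            nearest[i] = min(nearest[i], dist(p, points[nxt]))
    return centers


def cluster_with_sampling(points, k, sample_size, seed=0, dist=euclidean):
    """Pick centers on a random sample (hypothetical simplification),
    then assign every point to its nearest center."""
    rng = random.Random(seed)
    sample_idx = rng.sample(range(len(points)), sample_size)
    sample = [points[i] for i in sample_idx]
    # map center positions in the sample back to indices in `points`
    centers = [sample_idx[c] for c in furthest_point_first(sample, k, dist)]
    labels = [
        min(range(k), key=lambda c: dist(p, points[centers[c]]))
        for p in points
    ]
    return centers, labels
```

For example, `cluster_with_sampling(vectors, k=3, sample_size=len(vectors))` runs the heuristic on the full input. A smaller `sample_size` trades some center quality for fewer distance computations, which is the general motivation for sampling in this family of algorithms.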
