Conference: ASIST Annual Meeting

An Approach to Document Clustering Based on System Relevance

Abstract

Search engines fail to make a clear distinction between items of varying relevance when presenting search results to users. Instead, they rely on the user of the system to estimate which items are relevant, partially relevant, or not relevant. The user of the system is given the tedious task of distinguishing between documents that are relevant to different degrees. This often hinders the accessibility of relevant or partially relevant documents, particularly when the result set is large and many non-relevant documents are scattered throughout the set. In this paper, we present the results of a clustering scheme that groups documents into relevant, partially relevant, and not relevant clusters for a given search. A ranking algorithm accomplishes the task of clustering the documents based on system relevance. Data was collected from end-users issuing categorical, interval, and descriptive relevance judgments. The degree of overlap between users and the system for each of the clustered regions was measured. This research showed that clustering documents on the Web by regions of relevance is quite feasible.
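
The abstract does not say how the ranking algorithm delimits the three regions or how overlap was computed. The following is only a minimal sketch of the general idea, assuming a normalized system relevance score in [0, 1], two hypothetical cut-off values, and overlap defined as the share of documents in a system region that a user labeled the same way. All names, thresholds, scores, and labels are illustrative and are not taken from the paper.

```python
# Hypothetical cut-offs on a normalized system relevance score in [0, 1];
# the abstract does not state how the regions are actually delimited.
RELEVANT_MIN = 0.66
PARTIAL_MIN = 0.33

REGIONS = ("relevant", "partially relevant", "not relevant")


def region(score, relevant_min=RELEVANT_MIN, partial_min=PARTIAL_MIN):
    """Map a system relevance score to one of the three regions."""
    if score >= relevant_min:
        return "relevant"
    if score >= partial_min:
        return "partially relevant"
    return "not relevant"


def cluster_by_relevance(scores):
    """Group document ids by region, highest score first within each region."""
    clusters = {name: [] for name in REGIONS}
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    for doc_id, score in ranked:
        clusters[region(score)].append(doc_id)
    return clusters


def overlap_by_region(clusters, user_labels):
    """Share of judged documents in each system region that the user
    assigned to the same region (None if no document was judged)."""
    overlap = {}
    for name, doc_ids in clusters.items():
        judged = [d for d in doc_ids if d in user_labels]
        if not judged:
            overlap[name] = None
            continue
        agree = sum(1 for d in judged if user_labels[d] == name)
        overlap[name] = agree / len(judged)
    return overlap


# Made-up scores and categorical user judgments, for illustration only.
scores = {"d1": 0.91, "d2": 0.72, "d3": 0.48, "d4": 0.35, "d5": 0.10}
user_labels = {
    "d1": "relevant",
    "d2": "partially relevant",
    "d3": "partially relevant",
    "d4": "not relevant",
    "d5": "not relevant",
}

clusters = cluster_by_relevance(scores)
overlap = overlap_by_region(clusters, user_labels)
print(clusters)
print(overlap)
```

Fixed cut-offs are the simplest possible choice here. Interval or descriptive judgments of the kind the abstract mentions would first need to be mapped onto the three categories before an overlap of this form could be computed.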