Annual German Conference on Artificial Intelligence

Clustering Objects from Multiple Collections

Abstract

Clustering methods group objects on the basis of a similarity measure between them. In clustering tasks where the objects come from more than one collection, part of the similarity often results from features that are related to the collections rather than from features that are relevant to the clustering task. For example, when pages from various web sites are clustered by topic, pages from the same web site often contain similar terms. This collection-related part of the similarity hinders clustering because it leads to clusters that correspond to collections instead of topics. In this paper we present two methods that restrict clustering to the part of the similarity that is not associated with membership of a collection. Both methods can be used on top of standard clustering methods. Experiments on data sets with objects from multiple collections show that our methods result in better clusters than methods that do not take collection information into account.
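The abstract does not say what the two methods are, so the following is only an illustrative sketch of the general idea, not the authors' approach. It assumes each object is a term-count vector and removes the collection-related component by subtracting each collection's mean vector before computing cosine similarity. The function names, the toy vectors, and the centering technique itself are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def center_by_collection(vectors, collections):
    """Subtract each collection's mean vector from its members.

    Assumption for illustration: features shared by a whole collection
    (for example site-wide boilerplate terms) show up in that mean.
    """
    groups = defaultdict(list)
    for vec, coll in zip(vectors, collections):
        groups[coll].append(vec)
    means = {
        coll: [sum(column) / len(members) for column in zip(*members)]
        for coll, members in groups.items()
    }
    return [
        [x - m for x, m in zip(vec, means[coll])]
        for vec, coll in zip(vectors, collections)
    ]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Hypothetical toy data. Columns: site-A term, site-B term,
# topic-1 term, topic-2 term.
pages = [
    [3, 0, 1, 0],  # site A, topic 1
    [3, 0, 0, 1],  # site A, topic 2
    [0, 3, 1, 0],  # site B, topic 1
    [0, 3, 0, 1],  # site B, topic 2
]
sites = ["A", "A", "B", "B"]

adjusted = center_by_collection(pages, sites)

# Raw vectors: the same-site pair (0, 1) scores higher than the
# same-topic pair (0, 2), so clusters would follow sites.
print(cosine(pages[0], pages[1]), cosine(pages[0], pages[2]))

# Centered vectors: the same-topic pair (0, 2) now scores higher
# than the same-site pair (0, 1).
print(cosine(adjusted[0], adjusted[1]), cosine(adjusted[0], adjusted[2]))
```

The adjusted vectors, or a similarity matrix built from them, could then be passed to any standard clustering method, which matches the abstract's remark that the methods work on top of existing clustering algorithms.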