Journal: SIGKDD Explorations

Scalable Hierarchical Clustering with Tree Grafting

Abstract

We introduce Grinch, a new algorithm for large-scale, non-greedy hierarchical clustering with general linkage functions that compute arbitrary similarity between two point sets. The key components of Grinch are its rotate and graft subroutines that efficiently reconfigure the hierarchy as new points arrive, supporting discovery of clusters with complex structure. Grinch is motivated by a new notion of separability for clustering with linkage functions: we prove that when the linkage function is consistent with a ground-truth clustering, Grinch is guaranteed to produce a cluster tree containing the ground-truth, independent of data arrival order. Our empirical results on benchmark and author coreference datasets (with standard and learned linkage functions) show that Grinch is more accurate than other scalable methods, and orders of magnitude faster than hierarchical agglomerative clustering.
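
To make the setting in the abstract concrete, here is a minimal sketch of two ingredients it mentions: a linkage function that scores the similarity of two point sets, and a check for whether a cluster tree "contains" a ground-truth clustering (every ground-truth cluster is exactly the point set under some tree node). Everything in it is an illustrative assumption rather than the authors' code: the names `Node`, `average_linkage`, `insert`, and `tree_contains` are hypothetical, average linkage is just one possible choice, and the insertion step is a plain greedy nearest-leaf rule. Grinch's rotate and graft subroutines are not implemented here.

```python
import math
from dataclasses import dataclass
from itertools import product
from typing import List, Optional, Tuple

Point = Tuple[float, ...]


@dataclass(eq=False)
class Node:
    """A node of a binary cluster tree; `points` holds all points below it."""
    points: List[Point]
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    parent: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None


def average_linkage(a: List[Point], b: List[Point]) -> float:
    """Assumed linkage: negative mean pairwise Euclidean distance (higher = more similar)."""
    total = sum(math.dist(p, q) for p, q in product(a, b))
    return -total / (len(a) * len(b))


def leaves(node: Node) -> List[Node]:
    if node.is_leaf():
        return [node]
    return leaves(node.left) + leaves(node.right)


def insert(root: Optional[Node], point: Point, linkage=average_linkage) -> Node:
    """Greedy online insertion: the new point becomes the sibling of its most similar leaf.

    This is NOT Grinch; it omits the rotate and graft steps entirely.
    """
    new_leaf = Node([point])
    if root is None:
        return new_leaf
    nearest = max(leaves(root), key=lambda leaf: linkage(leaf.points, new_leaf.points))
    old_parent = nearest.parent
    merged = Node(nearest.points + [point], left=nearest, right=new_leaf, parent=old_parent)
    nearest.parent = merged
    new_leaf.parent = merged
    if old_parent is None:
        return merged  # the nearest leaf was the root, so `merged` is the new root
    if old_parent.left is nearest:
        old_parent.left = merged
    else:
        old_parent.right = merged
    # Every ancestor of the new internal node now also covers the new point.
    ancestor = old_parent
    while ancestor is not None:
        ancestor.points.append(point)
        ancestor = ancestor.parent
    return root


def tree_contains(root: Node, clustering: List[List[Point]]) -> bool:
    """True if each ground-truth cluster equals the point set of some tree node.

    Assumes all points are distinct, since point sets are compared as frozensets.
    """
    node_sets = set()
    stack = [root]
    while stack:
        node = stack.pop()
        node_sets.add(frozenset(node.points))
        if not node.is_leaf():
            stack.extend([node.left, node.right])
    return all(frozenset(cluster) in node_sets for cluster in clustering)


# Two well-separated 1-D clusters, arriving interleaved.
truth = [[(0.0,), (0.1,)], [(10.0,), (10.1,)]]
root = None
for p in [(0.0,), (10.0,), (0.1,), (10.1,)]:
    root = insert(root, p)
print(tree_contains(root, truth))
```

The greedy rule above carries no guarantee in general. The abstract attributes robustness to data arrival order to the rotate and graft subroutines, which reconfigure the tree after insertion and are deliberately left out of this sketch.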
