Proceedings of the Eighth ACM SIGKDD international conference on knowledge discovery and data mining (KDD-2000)

Hierarchical model-based clustering of large datasets through fractionation and refractionation

Abstract

The goal of clustering is to identify distinct groups in a dataset. Compared to non-parametric clustering methods like complete linkage, hierarchical model-based clustering has the advantage of offering a way to estimate the number of groups present in the data. However, its computational cost is quadratic in the number of items to be clustered, and it is therefore not applicable to large problems. We review an idea called Fractionation, originally conceived by Cutting, Karger, Pedersen and Tukey for non-parametric hierarchical clustering of large datasets, and describe an adaptation of Fractionation to model-based clustering. A further extension, called Refractionation, leads to a procedure that can be successful even in the difficult situation where there are large numbers of small groups.
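
The abstract names Fractionation without spelling out its steps. The sketch below illustrates the general idea under stated assumptions: split the items into fixed-size fractions, cluster each fraction agglomeratively down to a smaller number of groups, treat each group as a single weighted meta-observation, and repeat until what remains is small enough to cluster directly. The names `fractionate`, `fraction_size`, and `reduction` are hypothetical, and a Ward-style merge cost stands in for the likelihood-based criterion of model-based clustering. It is not the authors' implementation, and it does not attempt Refractionation.

```python
import random

def merge_cost(a, b):
    """Ward-style cost of merging two weighted centroids (weight, mean).
    Assumption: a stand-in for a model-based (likelihood) merge criterion."""
    (wa, ma), (wb, mb) = a, b
    d2 = sum((x - y) ** 2 for x, y in zip(ma, mb))
    return wa * wb / (wa + wb) * d2

def merge(a, b):
    """Combine two weighted centroids into one."""
    (wa, ma), (wb, mb) = a, b
    w = wa + wb
    return (w, tuple((wa * x + wb * y) / w for x, y in zip(ma, mb)))

def agglomerate(items, k):
    """Naive agglomerative clustering: merge the cheapest pair until k remain."""
    items = list(items)
    while len(items) > k:
        i, j = min(
            ((i, j) for i in range(len(items)) for j in range(i + 1, len(items))),
            key=lambda ij: merge_cost(items[ij[0]], items[ij[1]]),
        )
        merged = merge(items[i], items[j])
        items = [c for idx, c in enumerate(items) if idx not in (i, j)] + [merged]
    return items

def fractionate(points, k, fraction_size=50, reduction=0.25):
    """Reduce points to k weighted centroids, one fraction at a time.
    fraction_size and reduction are hypothetical tuning parameters."""
    metas = [(1, tuple(p)) for p in points]  # every point starts with weight 1
    while len(metas) > fraction_size:
        reduced = []
        for start in range(0, len(metas), fraction_size):
            fraction = metas[start:start + fraction_size]
            target = max(1, int(len(fraction) * reduction))
            reduced.extend(agglomerate(fraction, target))
        metas = reduced  # meta-observations feed the next pass
    return agglomerate(metas, k)

# Synthetic data (an assumption for illustration): three tight, well-separated groups.
rng = random.Random(0)
centers = [(0.0, 0.0), (100.0, 100.0), (200.0, 0.0)]
points = [(cx + rng.uniform(-0.1, 0.1), cy + rng.uniform(-0.1, 0.1))
          for cx, cy in centers for _ in range(100)]
rng.shuffle(points)

centroids = fractionate(points, k=3)
for weight, mean in sorted(centroids, key=lambda c: c[1]):
    print(weight, [round(x, 2) for x in mean])
```

Because each fraction has a fixed maximum size, the work in one pass grows linearly with the number of items rather than quadratically. The sketch relies on every fraction keeping more meta-observations than there are true groups; when groups are numerous and small that assumption can fail, which is the difficult situation the abstract says Refractionation is meant to handle.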
