首页> 外文会议>ACM SIGKDD international conference on knowledge discovery and data mining >Accelerating Exact k-means Algorithms with Geometric Reasoning
【24h】

Accelerating Exact k-means Algorithms with Geometric Reasoning

机译:利用几何推理加速精确的k均值算法

获取原文

摘要

We present new algorithms for the k-means clustering problem. They use a new kind of kd-tree traversal algorithm supplemented with a novel pruning test to give sublinear cost both in the number of datapoints and in the number of centers. The kd-trees are decorated with extra 'cached sufficient statistics' as in (3). Sufficient statistics are stored in the nodes of the kd-tree. Then, an analysis of the geometry of the current cluster centers results in great reduction of the work needed to update the centers. Our algorithms behave exactly as the traditional k-means algorithm. Proofs of correctness are included. The kd-tree can also be used to initialize the k-means starting centers efficiently. Our algorithms can be easily extended to provide fast ways of computing the error of a given cluster assignment, regardless of the method in which those clusters were obtained. We also show how to use them in a setting which allows approximate clustering results, with the benefit of running faster.
机译:我们提出了针对k均值聚类问题的新算法。他们使用一种新型的kd-tree遍历算法,并辅以新颖的修剪测试,以给出数据点数量和中心数量的次线性成本。如(3)所示,kd树装饰有额外的“缓存的足够统计量”。足够的统计信息存储在kd-tree的节点中。然后,对当前群集中心的几何形状进行分析可以大大减少更新中心所需的工作。我们的算法与传统的k均值算法完全一样。包括正确性证明。 kd树还可以用于有效地初始化k均值起始中心。我们的算法可以轻松扩展,以提供快速的方法来计算给定群集分配的误差,而与获得这些群集的方法无关。我们还将展示如何在允许近似聚类结果的设置中使用它们,并具有运行速度更快的好处。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号