Fast Nonparametric Density-Based Clustering of Large Datasets Using a Stochastic Approximation Mean-Shift Algorithm


Abstract

Mean shift is an iterative procedure often used as a nonparametric clustering algorithm that defines clusters based on the modal regions of a density function. The algorithm is conceptually appealing and makes no assumptions about either the shape of the clusters or their number. However, with a complexity of O(n^2) per iteration, it does not scale well to large datasets. We propose a novel algorithm that performs density-based clustering much faster than mean shift while delivering virtually identical results. This algorithm combines subsampling and a stochastic approximation procedure to achieve a potential complexity of O(n) at each step. Its convergence is established. Its performance is evaluated using simulations and applications to image segmentation, where the algorithm was tens or hundreds of times faster than mean shift while causing a negligible number of clustering errors. The algorithm can be combined with existing approaches to further accelerate clustering.
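
The abstract does not give the update rule, so the following is only a rough sketch of the general idea, not the authors' algorithm. It assumes a Gaussian kernel, a fixed subsample size, and step sizes of the Robbins-Monro type (gamma_k = k^-0.75); all function names and parameter values are hypothetical. A plain mean-shift update visits all n data points for each of the n points being moved, hence O(n^2) per iteration. Replacing the full pass with a fixed-size random subsample makes each update cost O(1) per point, so O(n) overall, and the decreasing step size averages out the subsampling noise.

```python
import math
import random


def kernel_mean(x, points, bandwidth):
    """Gaussian-kernel weighted mean of `points`, centred on x."""
    num = [0.0] * len(x)
    den = 0.0
    for p in points:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, p))
        w = math.exp(-sq_dist / (2.0 * bandwidth ** 2))
        den += w
        for d, coord in enumerate(p):
            num[d] += w * coord
    if den == 0.0:  # every weight underflowed to zero: leave x where it is
        return list(x)
    return [v / den for v in num]


def run_mean_shift(x0, data, bandwidth, iters=50):
    """Plain mean shift for one point: each update scans the whole dataset."""
    x = list(x0)
    for _ in range(iters):
        x = kernel_mean(x, data, bandwidth)
    return x


def run_stochastic(x0, data, bandwidth, batch_size=40, iters=200, seed=0):
    """Assumed stochastic variant: each update scans only a random subsample."""
    rng = random.Random(seed)
    x = list(x0)
    for k in range(1, iters + 1):
        batch = rng.sample(data, batch_size)       # cost independent of len(data)
        target = kernel_mean(x, batch, bandwidth)  # noisy mean-shift target
        gamma = k ** -0.75                         # assumed decreasing step size
        x = [xi + gamma * (ti - xi) for xi, ti in zip(x, target)]
    return x


def make_two_blobs(n_per_blob=200, seed=1):
    """Toy data: two 2-D Gaussian blobs centred at (0, 0) and (5, 5)."""
    rng = random.Random(seed)
    centres = [(0.0, 0.0), (5.0, 5.0)]
    return [(rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
            for cx, cy in centres for _ in range(n_per_blob)]
```

Calling `run_mean_shift((0.8, 0.8), data, 0.5)` and `run_stochastic((0.8, 0.8), data, 0.5)` on `data = make_two_blobs()` should move the starting point toward the same mode near the first blob; clustering would then group points by the mode they reach.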
