
Optimal Bandwidth Selection for Density-Based Clustering


Abstract

Cluster analysis has long played an important role in a wide variety of data applications. When clusters are irregular or intertwined, density-based clustering has proved to be much more efficient. The quality of the clustering result depends on an adequate choice of parameters. However, without sufficient domain knowledge, suitable parameter values are difficult to set in practice. In this paper, a new method is proposed to automatically find the optimal value of the bandwidth parameter. A parameter-estimation model is constructed and used to infer the most suitable value: based on Bayes' theorem, the most probable value of the bandwidth is obtained in accordance with the inherent distribution characteristics of the original data set. Clusters can then be identified using the determined parameter value. Experimental results show that the proposed method has complementary advantages in density-based clustering algorithms.
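The abstract does not give the paper's estimation model, so the following is only a minimal sketch of the general idea of choosing a bandwidth by Bayes' theorem, not the authors' method. It assumes a one-dimensional Gaussian kernel density estimate, a leave-one-out likelihood, and a flat prior over a user-supplied grid of candidate bandwidths; the names `loo_log_likelihood` and `map_bandwidth` are hypothetical.

```python
import numpy as np


def loo_log_likelihood(x, h):
    """Leave-one-out log-likelihood of a 1-D Gaussian KDE with bandwidth h."""
    n = x.size
    diff = (x[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * diff ** 2) / (np.sqrt(2.0 * np.pi) * h)
    np.fill_diagonal(k, 0.0)  # each point is excluded from its own estimate
    dens = k.sum(axis=1) / (n - 1)
    # Small floor avoids log(0) when a point has no neighbours within reach.
    return float(np.sum(np.log(dens + 1e-300)))


def map_bandwidth(x, candidates, log_prior=None):
    """Return the most probable candidate bandwidth and the grid posterior.

    Assumption: the posterior is proportional to likelihood times prior,
    evaluated only on the discrete candidate grid (flat prior by default).
    """
    x = np.asarray(x, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    if log_prior is None:
        log_prior = np.zeros_like(candidates)  # flat prior (assumption)
    log_post = np.array([loo_log_likelihood(x, h) for h in candidates])
    log_post = log_post + log_prior
    log_post -= log_post.max()  # stabilise before exponentiating
    post = np.exp(log_post)
    post /= post.sum()
    return candidates[np.argmax(post)], post
```

Calling `map_bandwidth(data, np.geomspace(0.01, 5.0, 40))` returns the selected bandwidth together with the normalised posterior over the grid. The selected value could then be passed to a density-based clustering routine as its bandwidth. A non-flat `log_prior` is the place to encode domain knowledge when some is available.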
