Journal: Computer Engineering and Applications (《计算机工程与应用》)

Density Peaks Algorithm with Automatic Determination of Cluster Centers

Abstract

Density Peaks Clustering (DPC) is a density-based clustering algorithm whose advantages include not requiring clustering parameters to be specified and the ability to discover non-spherical clusters. DPC has two weaknesses, however: the cutoff distance dc is computed by rule of thumb, which does not hold up across different scenarios, and the cluster centers are selected manually, which makes it hard to obtain the actual centers accurately. To address both, this paper proposes a method that adapts the cutoff distance using the Gini index and obtains the cluster centers automatically, overcoming the traditional DPC algorithm's inability to handle complex data sets. The algorithm first determines the cutoff distance dc adaptively via the Gini index, then computes a cluster-center weight for each point, and finally locates the critical point from the change in slope. This strategy avoids the errors introduced by manually selecting cluster centers from the decision graph. Experiments show that the new algorithm not only determines the cluster centers automatically but also achieves higher accuracy than the original algorithm.
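The three steps named in the abstract can be sketched roughly as follows. The abstract gives no formulas, so everything concrete here is an assumption rather than the paper's actual method: the Gaussian-kernel density, the weight gamma = rho * delta taken from standard DPC, the Gini index written as 1 minus the sum of squared normalized weights, the grid of candidate dc values, and the "largest change in slope" rule for the critical point. All function names are hypothetical.

```python
import math

def pairwise(points):
    """Full Euclidean distance matrix."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]

def center_weights(dist, dc):
    """Assumed weight gamma_i = rho_i * delta_i, as in standard DPC."""
    n = len(dist)
    # rho: Gaussian-kernel local density (assumption).
    rho = [sum(math.exp(-(dist[i][j] / dc) ** 2) for j in range(n) if j != i)
           for i in range(n)]
    order = sorted(range(n), key=lambda i: -rho[i])
    # delta: distance to the nearest point of higher density;
    # the densest point gets its largest distance instead.
    delta = [0.0] * n
    delta[order[0]] = max(dist[order[0]])
    for rank in range(1, n):
        i = order[rank]
        delta[i] = min(dist[i][j] for j in order[:rank])
    return [r * d for r, d in zip(rho, delta)]

def gini_index(weights):
    """Assumed form: G = 1 - sum(p_i^2), with p_i the normalized weights."""
    total = sum(weights)
    return 1.0 - sum((w / total) ** 2 for w in weights)

def choose_dc(dist, quantiles=(0.02, 0.05, 0.1, 0.2, 0.3)):
    """Pick the candidate dc with the smallest Gini index (assumed criterion)."""
    n = len(dist)
    flat = sorted(dist[i][j] for i in range(n) for j in range(i + 1, n))
    candidates = sorted({flat[int(q * (len(flat) - 1))] for q in quantiles})
    candidates = [c for c in candidates if c > 0]
    return min(candidates, key=lambda dc: gini_index(center_weights(dist, dc)))

def pick_centers(gamma):
    """Sort weights in descending order and cut where the slope changes most.

    Needs at least three points.
    """
    order = sorted(range(len(gamma)), key=lambda i: -gamma[i])
    g = [gamma[i] for i in order]
    drops = [g[i] - g[i + 1] for i in range(len(g) - 1)]
    changes = [drops[i] - drops[i + 1] for i in range(len(drops) - 1)]
    knee = max(range(len(changes)), key=changes.__getitem__)
    return order[:knee + 1]

# Toy data: two tight, well-separated clusters centred on indices 0 and 5.
points = [(0, 0), (0.1, 0), (0, 0.1), (-0.1, 0), (0, -0.1),
          (5, 5), (5.1, 5), (5, 5.1), (4.9, 5), (5, 4.9)]
dist = pairwise(points)
dc = choose_dc(dist)
centers = pick_centers(center_weights(dist, dc))
print(dc, sorted(centers))
```

Minimizing the Gini index favors a dc at which the weight mass concentrates on a few points, which is one plausible reading of "adaptive cutoff distance"; the paper may define the criterion differently.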
