Journal: International Journal of Pattern Recognition and Artificial Intelligence

Automatic Density Peaks Clustering Using DNA Genetic Algorithm Optimized Data Field and Gaussian Process

Abstract

Clustering by fast search and find of density peaks (DPC), introduced by Alex Rodriguez and Alessandro Laio, has attracted much attention in the field of pattern recognition and artificial intelligence. However, DPC still has several unresolved defects. First, the local density rho(i) of point i depends on the cutoff distance d_c, so the choice of d_c can influence the clustering result, especially for small real-world datasets. Second, the number of clusters is still determined intuitively, by selecting the cluster centers from the decision diagram. To overcome these defects, this paper proposes an automatic density peaks clustering approach using a DNA genetic algorithm optimized data field and a Gaussian process (referred to as ADPC-DNAGA). ADPC-DNAGA extracts the optimal threshold value using the potential entropy of the data field and automatically determines the cluster centers by the Gaussian process method. For any dataset to be clustered, the threshold is calculated objectively from the data rather than estimated empirically. The proposed algorithm is benchmarked on publicly available synthetic and real-world datasets that are commonly used for testing clustering algorithms. The results are compared not only with those of DPC but also with those of several well-known clustering algorithms, such as Affinity Propagation, DBSCAN and Spectral Clustering. The experimental results demonstrate that the proposed algorithm can find the optimal cutoff distance d_c and automatically identify clusters regardless of their shape and the dimension of the embedded space, and that it often outperforms the compared algorithms.
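
To make the first defect concrete, the following is a minimal sketch of the two quantities that the original DPC formulation plots in its decision diagram: the local density rho (number of points within the cutoff distance) and delta (distance to the nearest point of higher density). It is not the ADPC-DNAGA method from the paper. The function name `dpc_rho_delta`, the parameter name `d_c`, and the toy data are assumptions made for illustration only.

```python
import math

def dpc_rho_delta(points, d_c):
    """Return (rho, delta) lists for the basic cutoff-kernel form of DPC."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # rho_i: how many other points lie closer than the cutoff distance d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        # Distances from point i to every strictly denser point.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # By convention, a point with no denser neighbour takes its
        # largest distance to any other point.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta

# Hypothetical toy data: two tight groups on a line.
points = [(0.0,), (0.1,), (0.2,), (5.0,), (5.1,), (5.2,)]
for d_c in (0.15, 1.0):
    rho, delta = dpc_rho_delta(points, d_c)
    print(f"d_c={d_c}: rho={rho}, delta={[round(d, 2) for d in delta]}")
```

With the smaller cutoff, the middle point of each group has the highest density and a large delta, so two candidate centers stand out. With the larger cutoff, every point gets the same density and the decision diagram no longer separates centers from other points, which is the sensitivity to the cutoff distance that the abstract describes.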
