Journal of Classification

Proximity Curves for Potential-Based Clustering


Abstract

The concept of a proximity curve and a new algorithm are proposed for obtaining clusters in a finite set of data points in finite-dimensional Euclidean space. Each point is endowed with a potential constructed by means of a multi-dimensional Cauchy density, contributing to an overall anisotropic potential function. Guided by the steepest descent algorithm, the data points are successively visited and removed one by one, and at each stage the overall potential is updated and the magnitude of its local gradient is calculated. The result is a finite sequence of tuples, the proximity curve, whose pattern is analysed to give rise to a deterministic clustering. The finite set of all such proximity curves, in conjunction with a simulation study of their distribution, results in a probabilistic clustering represented by a distribution on the set of dendrograms. A two-dimensional synthetic data set is used to illustrate the proposed potential-based clustering idea. It is shown that the results achieved are plausible, since both the 'geographic distribution' of data points and the 'topographic features' imposed by the potential function are well reflected in the suggested clustering. Experiments using the Iris data set are conducted for validation purposes on classification and clustering benchmark data. The results are consistent with the proposed theoretical framework and data properties, and open up new approaches and applications for viewing data processing from different perspectives and for interpreting the contribution of data attributes to patterns.
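The visit-and-remove procedure described in the abstract can be illustrated with a rough Python sketch. This is not the authors' algorithm: the abstract gives no formulas, so the isotropic Cauchy kernel, the sign convention (data points as wells of the potential), the fixed step size and iteration count, and the rule for choosing the next point (the remaining point nearest to the end of the descent path) are all assumptions made here for illustration. The function names `cauchy_potential_grad` and `proximity_curve` are hypothetical.

```python
import numpy as np


def cauchy_potential_grad(x, points, scale=1.0):
    """Gradient at x of U(x) = -sum_i (1 + |x - p_i|^2 / scale^2)^(-(d+1)/2).

    Assumption: an isotropic multivariate Cauchy kernel with a shared scale;
    the paper's anisotropic potential is not reproduced here.
    """
    d = points.shape[1]
    diff = x - points                                # shape (n, d)
    q = 1.0 + np.sum(diff ** 2, axis=1) / scale ** 2
    coef = (d + 1) * q ** (-(d + 3) / 2) / scale ** 2
    return (coef[:, None] * diff).sum(axis=0)


def proximity_curve(points, start=0, scale=1.0, step=0.1, n_steps=50):
    """Return a list of (point index, gradient magnitude) tuples.

    Assumed procedure: remove the current point, record the gradient
    magnitude of the updated potential at its location, follow steepest
    descent for a fixed number of steps, then move to the nearest
    remaining data point.
    """
    points = np.asarray(points, dtype=float)
    remaining = list(range(len(points)))
    current = start
    curve = []
    while True:
        remaining.remove(current)
        if not remaining:
            curve.append((current, 0.0))             # no potential is left
            break
        rest = points[remaining]
        x = points[current].copy()
        grad = cauchy_potential_grad(x, rest, scale)
        curve.append((current, float(np.linalg.norm(grad))))
        for _ in range(n_steps):                     # plain steepest descent
            x -= step * cauchy_potential_grad(x, rest, scale)
        nearest = int(np.argmin(np.linalg.norm(rest - x, axis=1)))
        current = remaining[nearest]
    return curve


if __name__ == "__main__":
    pts = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [5.1, 5.0]])
    for index, magnitude in proximity_curve(pts):
        print(index, round(magnitude, 4))
```

Under these assumptions, one curve is produced per starting point. How the pattern of the curve is turned into clusters, and how the set of curves yields a distribution over dendrograms, is not specified in the abstract and is not attempted here.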
