
Data Field for Hierarchical Clustering



Abstract

In this paper, a data field is proposed to group data objects by simulating their mutual interactions and opposite movements for hierarchical clustering. Inspired by fields in physical space, a data field that simulates the nuclear field is presented to describe the interaction between objects in data space. In the data field, the self-organized process of equipotential lines over many data objects reveals their hierarchical clustering characteristics. During the clustering process, a random sample is first generated to optimize the impact factor. The masses of the data objects are then estimated to select the core data objects, namely those with nonzero masses. Taking the core data objects as the initial clusters, the clusters are iteratively merged hierarchy by hierarchy with good performance. The results of a case study show that the data field is capable of hierarchical clustering on objects varying in size, shape, or granularity without user-specified parameters, while also considering the object features inside the clusters and removing outliers from noisy data. The comparisons illustrate that data field clustering performs better than K-means, BIRCH, CURE, and CHAMELEON.
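The abstract does not state the potential function, so the following is only a minimal sketch of the idea of a short-range data field. It assumes a Gaussian-like potential, phi(x) = sum_i m_i * exp(-(||x - x_i|| / sigma)^2), with sigma standing in for the impact factor. The function name `potential`, the toy points, the equal masses, and the value of sigma are all hypothetical choices for illustration, not the paper's procedure for optimizing the impact factor or estimating masses.

```python
import math

def potential(x, points, masses, sigma):
    """Potential at position x under an ASSUMED Gaussian-like short-range field:
    phi(x) = sum_i m_i * exp(-(||x - x_i|| / sigma) ** 2)
    """
    total = 0.0
    for p, m in zip(points, masses):
        dist = math.dist(x, p)  # Euclidean distance (Python 3.8+)
        total += m * math.exp(-(dist / sigma) ** 2)
    return total

# Toy data (hypothetical): two tight groups and one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0),
          (10.0, 0.0)]
masses = [1.0 / len(points)] * len(points)  # equal masses, an assumption
sigma = 0.5                                 # hypothetical impact factor

# Potential evaluated at each data object's own position.
potentials = [potential(p, points, masses, sigma) for p in points]

for p, phi in zip(points, potentials):
    print(p, round(phi, 4))
```

Under these assumptions, objects inside a dense group sit at a higher potential than the isolated object, whose potential comes almost entirely from its own mass. This is the intuition behind reading clusters off equipotential lines and treating low-potential objects as outlier candidates.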
