Journal: Neurocomputing

Fast and stable clustering analysis based on Grid-mapping K-means algorithm and new clustering validity index



Abstract

As a classical data analysis technique, clustering plays an important role in identifying the natural structures of target datasets. However, many existing clustering methods, including clustering algorithms and clustering validity indexes (CVIs), still suffer from low efficiency, poor clustering accuracy, poor stability, and high sensitivity to noise points. In this paper, the Grid-K-means algorithm is first proposed to overcome the drawbacks of the traditional K-means algorithm by mapping datasets to grids. Then, using grid points as weighted representative points to process datasets, a new clustering validity index (BCVI) is designed to better evaluate the quality of the clustering results generated by the Grid-K-means algorithm. Owing to its monotonicity and its linear combination of intra-cluster compactness and inter-cluster separation, BCVI finds the optimal clustering number (K-opt) at a much lower time cost than the commonly used method, which relies on the empirical rule K-max ≤ √n to calculate K-opt. Experimental results on many types of datasets demonstrate that the Grid-K-means algorithm is faster and more accurate than the traditional algorithms. Meanwhile, experimental results comparing BCVI with seven existing CVIs show that BCVI is superior to the traditional indexes in terms of clustering stability and data processing speed. © 2019 Elsevier B.V. All rights reserved.
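
The abstract does not spell out the details of Grid-K-means, so the following is only a minimal sketch of the general idea it describes: map the data to a grid, then cluster the occupied grid cells as weighted representative points. It assumes a regular grid with the same number of cells on every axis, cell centers as representatives, point counts as weights, and plain Lloyd iterations; the names `grid_map`, `weighted_kmeans`, and `cells_per_axis` are hypothetical and not taken from the paper.

```python
import numpy as np


def grid_map(points, cells_per_axis):
    """Snap points to a regular grid; return occupied cell centers and point counts.

    Assumption: a uniform grid over the data's bounding box (not the paper's exact mapping).
    """
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    span = points.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero on constant axes
    # Integer cell index per point, clipped so the maximum lands in the last cell.
    idx = np.floor((points - lo) / span * cells_per_axis).astype(int)
    idx = np.clip(idx, 0, cells_per_axis - 1)
    cells, counts = np.unique(idx, axis=0, return_counts=True)
    centers = lo + (cells + 0.5) * span / cells_per_axis
    return centers, counts.astype(float)


def weighted_kmeans(reps, weights, k, n_iter=50, seed=0):
    """Lloyd iterations on representative points, each weighted by its point count."""
    rng = np.random.default_rng(seed)
    centroids = reps[rng.choice(len(reps), size=k, replace=False)].copy()

    def assign(c):
        # Squared distance from every representative to every centroid.
        d2 = ((reps[:, None, :] - c[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centroids)
        new_centroids = centroids.copy()
        for j in range(k):
            mask = labels == j
            if mask.any():  # keep the old centroid if a cluster empties
                new_centroids[j] = np.average(reps[mask], axis=0, weights=weights[mask])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, assign(centroids)


# Usage: two synthetic Gaussian blobs, clustered via at most 20 x 20 grid cells.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 0.3, (500, 2)), rng.normal(5, 0.3, (500, 2))])
reps, w = grid_map(data, cells_per_axis=20)
centroids, rep_labels = weighted_kmeans(reps, w, k=2)
```

In this sketch the clustering loop runs over the occupied cells rather than over all n points, which is one plausible source of the speed-up the abstract reports; the grid resolution trades speed against quantization error.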
