An efficient grid-based clustering method by finding density peaks

Abstract

Clustering or categorizing an unprocessed data set is essential in many areas. Many successful methods have been published, but they first need to calculate the mutual distances between data points. This step incurs considerable computational cost, which prevents state-of-the-art methods such as clustering by fast search and find of density peaks (FSFDP, published in Science, 2014) from being applied to real-life problems (e.g., data sets with thousands of data points). In this paper, an efficient grid-based clustering (GBC) method by finding density peaks is described. It keeps the advantage of the friendly interactive interface of the FSFDP while, at the same time, greatly reducing the computational complexity. The time complexity of the FSFDP is O(np(np - 1)/2), while our method reduces it to O(np * size(grid)), where np is the number of data points and size(grid) is the size of the grid. Because the size of the grid is always much smaller than np, the time complexity of our approach is almost linearly proportional to np. The presented GBC method by finding density peaks was able to calculate the densities and categorize data sets in much less time, which makes the density-peak-based algorithm practical. With the presented algorithm, it was also possible to cluster high-dimensional data sets. The GBC method by finding density peaks was successfully verified by clustering several data sets that are commonly used to test clustering algorithms in published articles. The presented method turned out to be much faster and more efficient at clustering data sets into different categories than conventional density-based methods, which makes the proposed method preferable.
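The abstract does not spell out the steps of the GBC method, so the following is only a minimal Python sketch of the general idea it describes: bin the np points into grid cells in one pass, then run density-peak selection on the occupied cells instead of on all point pairs. The function name `grid_density_peaks`, the parameters `cells_per_dim` and `n_clusters`, the use of per-cell point counts as density, and the automatic choice of centers by the largest density-times-distance score are all assumptions made for illustration; they are not taken from the paper, and the FSFDP's interactive selection of centers is not reproduced here.

```python
import numpy as np


def grid_density_peaks(points, cells_per_dim=10, n_clusters=2):
    """Illustrative grid-based density-peak clustering (not the authors' code)."""
    points = np.asarray(points, dtype=float)
    lo, hi = points.min(axis=0), points.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # guard against zero-width dimensions

    # Step 1: map every point to an integer grid cell, one pass over the data.
    idx = np.minimum((points - lo) / span * cells_per_dim,
                     cells_per_dim - 1).astype(int)

    # Step 2: occupied cells, point-to-cell map, and point count per cell.
    # The count is used as the cell density rho (an assumption of this sketch).
    cells, inverse, rho = np.unique(
        idx, axis=0, return_inverse=True, return_counts=True)
    inverse = inverse.reshape(-1)
    m = len(cells)

    # Step 3: distances between occupied cells only (m is much smaller than np).
    diff = cells[:, None, :] - cells[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))

    # Step 4: for each cell, delta = distance to the nearest denser cell.
    order = np.argsort(-rho, kind="stable")  # cells by decreasing density
    delta = np.empty(m)
    parent = np.full(m, -1)
    for rank, c in enumerate(order):
        if rank == 0:
            delta[c] = dist[c].max()  # densest cell has no denser neighbour
            continue
        denser = order[:rank]
        j = denser[np.argmin(dist[c, denser])]
        delta[c], parent[c] = dist[c, j], j

    # Step 5: pick centers by the largest rho * delta score.
    gamma = rho * delta
    gamma[order[0]] = np.inf  # the densest cell is always a center
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]

    # Step 6: every other cell inherits the label of its nearest denser cell.
    cell_label = np.full(m, -1)
    cell_label[centers] = np.arange(len(centers))
    for c in order:
        if cell_label[c] == -1:
            cell_label[c] = cell_label[parent[c]]

    return cell_label[inverse]  # one label per input point
```

Calling `grid_density_peaks(data, cells_per_dim=10, n_clusters=2)` on an array of shape (np, d) returns one integer label per point. In this sketch the binning is a single pass over the points and the pairwise work is confined to the occupied cells, which is where the saving over all-pairs point distances would come from when the grid is small relative to np.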

Bibliographic details

Source
Annual Conference of the IEEE Industrial Electronics Society | 2016 | 630p | 6 pages
Authors
Bo Wu; B. M. Wilamowski
Original format: PDF
Chinese Library Classification: TN1-53
Keywords
Clustering algorithms; Clustering methods; Standards; Time complexity; Algorithm design and analysis; Shape

