A kernel density smoothing method for determining an optimal number of clusters in continuous data



Abstract

While data clustering algorithms are becoming increasingly popular across scientific, industrial and social data mining applications, model complexity remains a major challenge. Most clustering algorithms do not incorporate a mechanism for finding an optimal scale parameter that corresponds to an appropriate number of clusters. We propose (BASINS~(-1)), a kernel-density smoothing-based approach to data clustering. Its main ideas derive from two unsupervised clustering approaches: kernel density estimation (KDE) and scale-space clustering (SSC). The method determines the optimal number of clusters by first finding dense regions in the data and then separating them based on data-dependent parameter estimates. The inherent number of arbitrarily shaped clusters is first detected without a priori information, and the optimal number of clusters is then determined from different levels of smoothing. We demonstrate the applicability of the proposed method under both nested and non-nested hierarchical clustering methodologies. Results on simulated and real data are presented to validate the performance of the method, with repeated runs showing high accuracy and reliability.
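The general idea of reading a cluster count off several smoothing levels can be illustrated with a minimal sketch. This is not the authors' method, whose details the abstract does not give. It is a generic one-dimensional scale-space illustration: a Gaussian KDE is evaluated over a range of bandwidths, the modes (dense regions) are counted at each bandwidth, and the count that persists over the most bandwidths is reported. The function names, the log-spaced bandwidth grid and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def gaussian_kde_on_grid(x, grid, h):
    """Evaluate a Gaussian kernel density estimate with bandwidth h on grid."""
    z = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(x) * h * np.sqrt(2.0 * np.pi))


def count_modes(density):
    """Count local maxima of a sampled curve (rising step followed by a non-rising step)."""
    d = np.diff(density)
    return int(np.sum((d[:-1] > 0) & (d[1:] <= 0)))


def most_persistent_mode_count(x, bandwidths, grid_size=2048):
    """Return the mode count seen at the most bandwidths, plus all counts.

    Hypothetical helper: "persistence" here is simply how many of the
    supplied bandwidths yield a given number of modes.
    """
    pad = 3.0 * np.max(bandwidths)
    grid = np.linspace(x.min() - pad, x.max() + pad, grid_size)
    counts = [count_modes(gaussian_kde_on_grid(x, grid, h)) for h in bandwidths]
    values, freq = np.unique(counts, return_counts=True)
    return int(values[np.argmax(freq)]), counts


# Synthetic example (assumed data): three well-separated 1-D groups.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(m, 0.4, 200) for m in (-5.0, 0.0, 5.0)])
bandwidths = np.geomspace(0.1, 5.0, 40)  # log-spaced smoothing levels
k, counts = most_persistent_mode_count(x, bandwidths)
print("most persistent number of modes:", k)
print("mode counts from least to most smoothing:", counts)
```

Small bandwidths typically produce many spurious modes and very large ones merge everything into a single mode, so the choice of bandwidth range affects the answer. A log-spaced grid is used here so that each scale contributes comparably.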


Bibliographic record

Source
International Conference on Computer Simulation in Risk Analysis and Hazard Mitigation | 2014 | 14 pages
Authors
J. Bugrien; K. Mwitondi; F. Shuweihdi
Original format: PDF
Chinese Library Classification: TP391.9-53
Keywords
BASINS~(-1); Data clustering; Data mining; Kernel density estimation; Local optimization; Scale-space clustering; Supervised learning; Unsupervised learning

Similar literature

1. Optimal Smoothing of Kernel-Based Topographic Maps with Application to Density-Based Clustering of Shapes [J]. Marc M. Van Hulle, Temujin Gautama. Journal of VLSI Signal Processing. 2004, No. 2a3.
2. Center-based clustering of categorical data using kernel smoothing methods [J]. Yan Xuanhui, Chen Lifei, Guo Gongde. Frontiers of Computer Science in China. 2018, No. 5.
3. The implementation of binned Kernel density estimation to determine open clusters' proper motions: validation of the method [J]. Priyatikanto R., Arifyanto M. I. Astrophysics and Space Science. 2015, No. 1.
4. A kernel density smoothing method for determining an optimal number of clusters in continuous data [C]. J. Bugrien, K. Mwitondi, F. Shuweihdi. Risk Analysis IX. 2014.
5. Image reconstruction of muon tomographic data using a density-based clustering method [D]. Perry, Kimberly B. 2015.
6. Smooth statistical torsion angle potential derived from a large conformational database via adaptive kernel density estimation improves the quality of NMR protein structures [O]. Guillermo A. Bermejo, G. Marius Clore, Charles D. Schwieters. 2012.
7. The Implementation of Binned Kernel Density Estimation to Determine Open Clusters' Proper Motions: Validation of the Method [O]. R. Priyatikanto, M. I. Arifyanto. 2016.
8. Density Biased Sampling: An Improved Method for Data Mining and Clustering [R]. Palmer, C. R., Faloutsos, C. 1999.
