A kernel density smoothing method for determining an optimal number of clusters in continuous data

Abstract

While data clustering algorithms are becoming increasingly popular across scientific, industrial and social data mining applications, model complexity remains a major challenge. Most clustering algorithms do not incorporate a mechanism for finding an optimal scale parameter that corresponds to an appropriate number of clusters. We propose BASINS~(-1), a kernel-density smoothing-based approach to data clustering. Its main ideas derive from two unsupervised clustering approaches: kernel density estimation (KDE) and scale-space clustering (SSC). The novel method determines the optimal number of clusters by first finding dense regions in the data and then separating them based on data-dependent parameter estimates. The optimal number of clusters is determined from different levels of smoothing, after the inherent number of arbitrarily shaped clusters has been detected without a priori information. We demonstrate the applicability of the proposed method under both nested and non-nested hierarchical clustering methodologies. Simulated and real data results are presented to validate the performance of the method, with repeated runs showing high accuracy and reliability.
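
The idea of reading a cluster count off different smoothing levels can be illustrated with a small sketch. This is not the authors' BASINS~(-1) procedure, which the abstract does not specify; it only counts the modes of a one-dimensional Gaussian kernel density estimate at several bandwidths. The function names, the toy sample, and the bandwidth values are all assumptions made for illustration.

```python
import math
import random


def gaussian_kde(data, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `data` at each grid point."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)
        for x in grid
    ]


def count_modes(density):
    """Count strict interior local maxima of a density sampled on a grid."""
    return sum(
        1
        for i in range(1, len(density) - 1)
        if density[i - 1] < density[i] > density[i + 1]
    )


def modes_by_bandwidth(data, bandwidths, grid_size=600):
    """Map each bandwidth (smoothing level) to the number of KDE modes."""
    lo, hi = min(data), max(data)
    counts = {}
    for h in bandwidths:
        # Extend the grid three bandwidths past the data so tail modes are interior.
        start, stop = lo - 3.0 * h, hi + 3.0 * h
        step = (stop - start) / (grid_size - 1)
        grid = [start + i * step for i in range(grid_size)]
        counts[h] = count_modes(gaussian_kde(data, grid, h))
    return counts


# Hypothetical toy data: two well-separated Gaussian groups (assumed, not from the paper).
rng = random.Random(0)
sample = [rng.gauss(0.0, 1.0) for _ in range(150)]
sample += [rng.gauss(8.0, 1.0) for _ in range(150)]

result = modes_by_bandwidth(sample, bandwidths=[0.3, 1.0, 10.0])
for h, n_modes in result.items():
    print(f"bandwidth={h}: {n_modes} mode(s)")
```

In this toy setting, heavy smoothing merges the two groups into a single mode, while light smoothing tends to produce extra modes caused by sampling noise. A mode count that stays stable over a wide range of bandwidths is one plausible candidate for the number of clusters.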


Bibliographic details

Source
Risk analysis IX | 2014 | pp. 165-178 | 14 pages
Conference location: New Forest (GB)
Authors
J. Bugrien; K. Mwitondi; F. Shuweihdi
Author affiliations

Statistics Department, Benghazi University, Libya;

Department of Computing, Sheffield Hallam University, UK;

School of Mathematics, University of Leeds, UK;

Original format: PDF
Language: English
Keywords
BASINS~(-1); data clustering; data mining; kernel density estimation; local optimization; scale-space clustering; supervised learning; unsupervised learning;


Similar literature

1. Optimal Smoothing of Kernel-Based Topographic Maps with Application to Density-Based Clustering of Shapes [J]. Marc M. Van Hulle, Temujin Gautama. Journal of VLSI Signal Processing. 2004, Issue 2a3
2. Center-based clustering of categorical data using kernel smoothing methods [J]. Yan Xuanhui, Chen Lifei, Guo Gongde. Frontiers of Computer Science in China. 2018, Issue 5
3. The implementation of binned Kernel density estimation to determine open clusters' proper motions: validation of the method [J]. Priyatikanto R., Arifyanto M. I. Astrophysics and Space Science. 2015, Issue 1
4. A kernel density smoothing method for determining an optimal number of clusters in continuous data [C]. J. Bugrien, K. Mwitondi, F. Shuweihdi. International Conference on Computer Simulation in Risk Analysis and Hazard Mitigation. 2014
5. Image reconstruction of muon tomographic data using a density-based clustering method [D]. Perry, Kimberly B. 2015
6. Smooth statistical torsion angle potential derived from a large conformational database via adaptive kernel density estimation improves the quality of NMR protein structures [O]. Guillermo A Bermejo, G Marius Clore, Charles D Schwieters. 2012
7. The Implementation of Binned Kernel Density Estimation to Determine Open Clusters' Proper Motions: Validation of the Method [O]. R. Priyatikanto, M. I. Arifyanto. 2016
8. Density Biased Sampling: An Improved Method for Data Mining and Clustering [R]. Palmer, C. R., Faloutsos, C. 1999
