Journal: Pattern Recognition Letters

Probabilistic density-based estimation of the number of clusters using the DBSCAN-martingale process


Abstract

Density-based clustering is an effective clustering approach that groups together dense patterns in low- and high-dimensional vectors, especially when the number of clusters is unknown. Such vectors are obtained, for example, when computer scientists represent unstructured data and then group them into clusters in an unsupervised way. Another facet of clustering similar artifacts is the detection of densely connected nodes in network structures, where communities of nodes are formed and need to be identified. To that end, we propose a new DBSCAN algorithm for estimating the number of clusters by optimizing a probabilistic process, namely DBSCAN-Martingale, which involves randomness in the selection of the density parameter. By providing an analytic formula, we minimize the number of iterations that the DBSCAN-Martingale process requires to extract all clusters. Experiments on spatial, textual and visual clustering show that the proposed analytic formula provides a suitable indicator for the optimal number of required iterations to extract all clusters. (C) 2019 Elsevier B.V. All rights reserved.
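
The abstract does not spell out the procedure, so the following is only an illustrative sketch of the general idea it describes: run DBSCAN with randomly drawn values of the density parameter, count the clusters found, and repeat. Everything specific here is an assumption made for illustration and is not taken from the paper: the uniform sampling of eps, the rule that a cluster is kept only if none of its points was clustered at a smaller eps, the use of the most frequent count as the estimate, and the names `dbscan`, `one_realization` and `estimate_num_clusters`. The analytic formula for the number of iterations is not reproduced; `T` is simply a parameter.

```python
import random
from collections import Counter


def dbscan(points, eps, min_pts):
    """Plain DBSCAN on a list of coordinate tuples; label -1 marks noise."""
    n = len(points)

    def neighbors(i):
        # All points within distance eps of point i (including i itself).
        return [j for j in range(n)
                if sum((a - b) ** 2 for a, b in zip(points[i], points[j])) <= eps * eps]

    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(nb)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbors(j)
            if len(nb_j) >= min_pts:
                queue.extend(nb_j)  # j is a core point, keep expanding
    return labels


def one_realization(points, eps_max, min_pts, T, rng):
    """One run with T random eps values (assumed uniform on [0, eps_max])."""
    eps_values = sorted(rng.uniform(0.0, eps_max) for _ in range(T))
    final = [-1] * len(points)
    k = 0
    for eps in eps_values:
        labels = dbscan(points, eps, min_pts)
        for c in set(labels) - {-1}:
            members = [i for i, lab in enumerate(labels) if lab == c]
            # Assumed rule: keep a cluster only if all its points are still unclustered.
            if all(final[i] == -1 for i in members):
                for i in members:
                    final[i] = k
                k += 1
    return k


def estimate_num_clusters(points, eps_max, min_pts, T=5, realizations=20, seed=0):
    """Repeat the random process and return the most frequent cluster count."""
    rng = random.Random(seed)
    counts = Counter(one_realization(points, eps_max, min_pts, T, rng)
                     for _ in range(realizations))
    return counts.most_common(1)[0][0]
```

In this sketch, a larger `T` makes it more likely that at least one sampled eps falls in the range where every cluster is detected, at the cost of more DBSCAN runs per realization; that trade-off is what a formula for the required number of iterations would address.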
