Journal: Pattern Analysis and Applications

ASCRCIu: an adaptive subspace combination and reduction algorithm for clustering of high-dimensional data


Abstract

The curse of dimensionality is one of the major challenges in clustering high-dimensional data. A considerable body of recent literature on subspace clustering addresses this challenge. The main objective of subspace clustering is to discover clusters embedded in any possible combination of attributes. Most previous approaches generate redundant subspace clusters, which reduces clustering accuracy and increases running time. In this paper, a bottom-up density-based approach is proposed for clustering high-dimensional data. We employ the cluster structure as a similarity measure to generate the optimal subspaces, which raises the accuracy of the subspace clustering. Using this idea, we propose an iterative algorithm that discovers similar subspaces using the similarity in the features of subspaces. At each iteration, the algorithm first determines similar subspaces, then combines them to generate higher-dimensional subspaces, and finally re-clusters the subspaces. The algorithm repeats these steps and converges to the final clusters. Experiments on various synthetic and real datasets show that the proposed approach is significantly better in both quality and runtime compared with the state of the art in clustering high-dimensional data. The accuracy of the proposed method is around 34% higher than that of the CLIQUE algorithm and around 6% higher than that of DiSH.
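
The iteration described in the abstract (determine similar subspaces, combine them, re-cluster) can be illustrated with a minimal Python sketch. This is not the authors' ASCRCIu implementation: the abstract does not specify the density-based clusterer, the similarity measure, or the stopping rule, so everything here is an assumption made for illustration. The sketch uses a tiny DBSCAN as the density-based step, measures the similarity of two subspaces as the Jaccard overlap of their co-clustered point pairs, merges the most similar pair while the score stays above a hypothetical `threshold`, and scales `eps` with the square root of the subspace dimensionality. The names `dbscan`, `similarity`, and `combine_similar_subspaces` are hypothetical.

```python
import itertools
import math


def dbscan(points, dims, eps, min_pts):
    """Tiny DBSCAN restricted to the attributes in `dims`; label -1 marks noise."""
    n = len(points)

    def neighbours(i):
        return [j for j in range(n)
                if sum((points[i][d] - points[j][d]) ** 2 for d in dims) <= eps ** 2]

    labels = [None] * n
    cluster_id = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1
            continue
        labels[i] = cluster_id
        while seeds:
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster_id  # former noise becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            reach = neighbours(j)
            if len(reach) >= min_pts:
                seeds.extend(reach)
        cluster_id += 1
    return labels


def similarity(labels_a, labels_b):
    """Assumed measure: Jaccard overlap of the point pairs each clustering groups together."""
    def co_clustered(labels):
        return {(i, j) for i, j in itertools.combinations(range(len(labels)), 2)
                if labels[i] != -1 and labels[i] == labels[j]}

    pairs_a, pairs_b = co_clustered(labels_a), co_clustered(labels_b)
    union = pairs_a | pairs_b
    return len(pairs_a & pairs_b) / len(union) if union else 0.0


def combine_similar_subspaces(points, threshold=0.8, eps=0.5, min_pts=3):
    """Bottom-up loop: start from 1-D subspaces, merge the most similar pair, re-cluster."""
    subspaces = {(d,): dbscan(points, (d,), eps, min_pts)
                 for d in range(len(points[0]))}
    while True:
        scored = [(similarity(subspaces[a], subspaces[b]), a, b)
                  for a, b in itertools.combinations(subspaces, 2)]
        if not scored:
            return subspaces
        score, a, b = max(scored)
        if score < threshold:
            return subspaces  # no sufficiently similar pair is left
        merged = tuple(sorted(set(a) | set(b)))
        del subspaces[a], subspaces[b]
        # Assumption: widen eps with dimensionality so density stays comparable.
        subspaces[merged] = dbscan(points, merged, eps * math.sqrt(len(merged)), min_pts)


# Toy data: attributes 0 and 1 share two groups, attribute 2 is spread out.
points = [(0.05 * k, 1.0 + 0.05 * ((3 * k) % 10), 3.0 * k) for k in range(10)]
points += [(5.0 + 0.05 * k, 8.0 + 0.05 * ((7 * k) % 10), 3.0 * (k + 10)) for k in range(10)]
result = combine_similar_subspaces(points)
print(sorted(result))
```

On this toy data, the two attributes that share a cluster structure end up in one combined subspace, while the spread-out third attribute stays on its own. Merging one pair per iteration keeps the sketch short; a real implementation would need a cheaper similarity computation than enumerating all point pairs.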
