Dimension induced clustering

Abstract

It is commonly assumed that most points in a high-dimensional dataset lie on low-dimensional manifolds. Detecting low-dimensional clusters is extremely useful for operations such as clustering and classification, but it is a challenging computational problem. In this paper we study the problem of finding subsets of points with low intrinsic dimensionality. Our main contribution is to extend the definition of fractal correlation dimension, which measures average volume growth rate, so that it estimates the intrinsic dimensionality of the data in local neighborhoods. We provide a careful analysis of several key examples to demonstrate the properties of our measure. Based on the proposed measure, we introduce a novel approach to discovering clusters with low dimensionality. The resulting algorithms extend previous density-based measures, which have been used successfully for clustering. We demonstrate the effectiveness of our algorithms for discovering low-dimensional m-flats embedded in high-dimensional spaces, and for detecting low-rank sub-matrices.
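To make the notion of "volume growth rate" concrete, the following minimal sketch estimates the standard (global) correlation dimension as the slope of log C(r) against log r, where C(r) is the fraction of point pairs closer than r. It is not the extended local measure the abstract refers to, whose definition is not given here. The function names, the radii 0.05 and 0.2, and the synthetic data are all assumptions chosen for illustration.

```python
import math
import random


def correlation_integral(points, r):
    """Fraction of distinct point pairs whose Euclidean distance is below r."""
    n = len(points)
    close = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) < r:
                close += 1
    return close / (n * (n - 1) / 2)


def correlation_dimension(points, r1, r2):
    """Slope of log C(r) versus log r between two radii r1 < r2.

    Assumes both radii capture at least one pair; otherwise log(0) fails.
    """
    c1 = correlation_integral(points, r1)
    c2 = correlation_integral(points, r2)
    return (math.log(c2) - math.log(c1)) / (math.log(r2) - math.log(r1))


rng = random.Random(0)  # fixed seed so the illustration is repeatable

# Hypothetical data: a 1-D line segment embedded in 3-D space.
line = [(t, 2 * t, -t) for t in (rng.random() for _ in range(400))]

# Hypothetical data: a 2-D unit square embedded in 3-D space.
square = [(rng.random(), rng.random(), 0.0) for _ in range(400)]

dim_line = correlation_dimension(line, 0.05, 0.2)
dim_square = correlation_dimension(square, 0.05, 0.2)

print(f"line:   estimated dimension {dim_line:.2f}")
print(f"square: estimated dimension {dim_square:.2f}")
```

Although both point sets live in three-dimensional space, the estimates should come out near 1 for the line and near 2 for the square, somewhat below the ideal values because of boundary effects. A local variant would apply the same growth-rate idea to the neighborhood of each point rather than to all pairs at once.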