Conference: IEEE International Workshop on Machine Learning for Signal Processing

Progressive clustering of manifold-modeled data based on tangent space variations


Abstract

An important research topic in recent years has been the analysis of manifold-modeled data for clustering and classification applications. Most clustering methods developed for data with nonlinear, low-dimensional structure are based on local linearity assumptions. However, clustering algorithms based on locally linear representations can tolerate difficult sampling conditions only to some extent, and may fail on sparsely sampled data manifolds or in high-curvature regions. In this paper, we consider a setting where each cluster is concentrated around a manifold and propose a manifold clustering algorithm that relies on the observation that the variation of the tangent space must be consistent along curves over the same data manifold. To achieve robustness against the challenges posed by noise, manifold intersections, and high curvature, we propose a progressive clustering approach: by observing the variation of the tangent space, we first detect the non-problematic manifold regions and form pre-clusters from the data samples belonging to such reliable regions. Next, these pre-clusters are merged into larger clusters subject to constraints on both the distance and the tangent space variations. Finally, the samples identified as problematic are assigned to the computed clusters to finalize the clustering. Experiments with synthetic and real datasets show that the proposed method outperforms the compared manifold clustering algorithms that are based on Euclidean distance and sparse representations.
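The three stages described in the abstract can be illustrated with a rough sketch. This is not the authors' algorithm. It is a simplified stand-in that assumes tangent spaces are estimated by local PCA, that tangent variation is measured by the largest principal angle between neighbouring tangent spaces, and that pre-clustering and merging can be collapsed into one connected-components pass. The function names and the parameters `k`, `dim`, and `angle_tol` are hypothetical.

```python
import numpy as np

def knn_indices(X, k):
    # Brute-force k nearest neighbours of every point (the point itself excluded).
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)
    return np.argsort(d2, axis=1)[:, :k]

def tangent_bases(X, nbrs, dim):
    # Local PCA: the top `dim` right singular vectors of each centred neighbourhood.
    bases = []
    for i in range(len(X)):
        P = X[np.append(nbrs[i], i)]
        _, _, Vt = np.linalg.svd(P - P.mean(0), full_matrices=False)
        bases.append(Vt[:dim].T)
    return np.stack(bases)  # shape (n, D, dim)

def tangent_gap(Bi, Bj):
    # Largest principal angle (radians) between two subspaces with orthonormal bases.
    s = np.linalg.svd(Bi.T @ Bj, compute_uv=False)
    return float(np.arccos(np.clip(s.min(), 0.0, 1.0)))

def progressive_cluster(X, k=8, dim=1, angle_tol=0.3):
    # Assumed simplification: all thresholds are hand-picked, not taken from the paper.
    n = len(X)
    nbrs = knn_indices(X, k)
    B = tangent_bases(X, nbrs, dim)
    gap = np.array([[tangent_gap(B[i], B[j]) for j in nbrs[i]] for i in range(n)])

    # Stage 1: a point is "reliable" if its tangent space varies little
    # across its whole neighbourhood.
    reliable = gap.max(axis=1) < angle_tol

    # Stage 2 (pre-clustering and merging collapsed): union reliable neighbours
    # that satisfy both the neighbourhood (distance) and tangent-angle constraints.
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    for i in np.flatnonzero(reliable):
        for c, j in enumerate(nbrs[i]):
            if reliable[j] and gap[i, c] < angle_tol:
                parent[find(i)] = find(j)

    labels = np.full(n, -1)
    roots = {}
    rel = np.flatnonzero(reliable)
    for i in rel:
        labels[i] = roots.setdefault(find(i), len(roots))

    # Stage 3: each problematic point takes the label of its nearest reliable point.
    if rel.size == 0:
        return np.zeros(n, dtype=int)
    for i in np.flatnonzero(~reliable):
        labels[i] = labels[rel[np.argmin(((X[rel] - X[i]) ** 2).sum(1))]]
    return labels
```

Calling `progressive_cluster(X, k=4, dim=1)` on points drawn from two well-separated curves in the plane should return one label per curve. How it behaves near intersections depends heavily on `angle_tol` and on how densely the data is sampled.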
