Clustering from Constraint Graphs

Abstract

In constrained clustering, it is common to model the pairwise constraints as edges on the graph of observations. Using results from graph theory, we analyze such constraint graphs in two contexts, both of immediate value to practitioners. First, we explore the issue of constraint noise under several intuitive noise models. We apply results from random graph theory, which facilitate the analysis of finite-sized graphs with realistic data partitions and noise levels, to quantify the effect that noisy edges may have on any constrained clustering algorithm under a set of commonly used assumptions. We also demonstrate the dangers of the common practice of connected-component constraint set augmentation when it is used in the presence of noise. Second, we describe two practical randomized algorithms that estimate the number of induced clusters using only a small number of constraints. We conclude with an experimental evaluation that shows the effect of noise on common UCI data sets, as well as some aspects of the behavior of our algorithms.
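To make the connected-component augmentation concern concrete, the following is a minimal sketch and not the authors' method or code. It assumes that augmentation means taking the transitive closure of must-link edges, which is the usual reading of the practice; the helper names `find` and `augment_must_links` and the six-point toy data are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def augment_must_links(n, must_links):
    """Return every pair implied by the transitive closure of must-link edges.

    Assumption: observations are labeled 0..n-1 and each constraint is an
    undirected pair (a, b).
    """
    parent = list(range(n))
    for a, b in must_links:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[ra] = rb

    # Group observations by connected component of the constraint graph.
    components = defaultdict(list)
    for i in range(n):
        components[find(parent, i)].append(i)

    # Every pair inside a component becomes an implied must-link.
    implied = set()
    for members in components.values():
        implied.update(combinations(sorted(members), 2))
    return implied


# Hypothetical toy data: true clusters are {0, 1, 2} and {3, 4, 5}.
clean = [(0, 1), (1, 2), (3, 4), (4, 5)]
noisy = clean + [(2, 3)]  # a single incorrect must-link across clusters

clean_closure = augment_must_links(6, clean)
noisy_closure = augment_must_links(6, noisy)
print(len(clean_closure), len(noisy_closure))  # prints "6 15"
```

In this toy case, one noisy edge merges the two components, so the augmented set grows from 6 implied pairs to 15, and the 9 additional pairs all link observations from different true clusters.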