Journal: IEEE Transactions on Knowledge and Data Engineering

Active Learning of Constraints for Semi-Supervised Clustering

Abstract

Semi-supervised clustering aims to improve clustering performance by considering user supervision in the form of pairwise constraints. In this paper, we study the active learning problem of selecting pairwise must-link and cannot-link constraints for semi-supervised clustering. We consider active learning in an iterative manner, where in each iteration queries are selected based on the current clustering solution and the existing constraint set. We apply a general framework that builds on the concept of neighborhoods, where neighborhoods contain "labeled examples" of different clusters according to the pairwise constraints. Our active learning method expands the neighborhoods by selecting informative points and querying their relationship with the neighborhoods. Under this framework, we build on the classic uncertainty-based principle and present a novel approach for computing the uncertainty associated with each data point. We further introduce a selection criterion that trades off the amount of uncertainty of each data point against the expected number of queries (the cost) required to resolve this uncertainty. This allows us to select queries that have the highest information rate. We evaluate the proposed method on benchmark data sets, and the results demonstrate consistent and substantial improvements over the current state of the art.
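The selection criterion described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formulas, so everything concrete here is an assumption made for illustration and not the authors' exact method. The sketch assumes that uncertainty is the Shannon entropy of a point's estimated membership probabilities over the current neighborhoods, and that the cost is the expected number of queries when neighborhoods are queried in decreasing order of probability. It also assumes that the membership estimates are already available and sum to 1 for each point. The function names `entropy`, `expected_queries`, `information_rate`, and `select_query_point` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy in bits; zero-probability terms contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def expected_queries(probs):
    """Expected number of queries needed to place a point.

    Assumption: neighborhoods are queried from most to least probable, and
    the point belongs to exactly one existing neighborhood, so a point in
    the neighborhood at rank k costs k queries.
    """
    ordered = sorted(probs, reverse=True)
    return sum(rank * p for rank, p in enumerate(ordered, start=1))


def information_rate(probs):
    """Uncertainty per expected query (assumed form of the trade-off)."""
    return entropy(probs) / expected_queries(probs)


def select_query_point(membership, candidates):
    """Return the candidate whose membership estimate has the highest rate.

    membership maps a point id to its probabilities over the neighborhoods.
    """
    return max(candidates, key=lambda i: information_rate(membership[i]))


# Hypothetical membership estimates for three unlabeled points.
membership = {
    0: [1.0, 0.0, 0.0],        # certain: nothing to learn from a query
    1: [0.5, 0.5, 0.0],        # torn between two neighborhoods
    2: [1 / 3, 1 / 3, 1 / 3],  # maximally uncertain, but costlier to resolve
}
chosen = select_query_point(membership, [0, 1, 2])
```

With these illustrative numbers, point 2 is chosen: its higher entropy outweighs its higher expected cost (2 queries versus 1.5 for point 1). Dividing by the cost is what separates this criterion from plain uncertainty sampling, which would rank points by entropy alone.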
