Conference paper · European conference on machine learning and knowledge discovery in databases

A Kernel-Learning Approach to Semi-supervised Clustering with Relative Distance Comparisons



Abstract

We consider the problem of clustering a given dataset into k clusters subject to an additional set of constraints on relative distance comparisons between the data items. The additional constraints are meant to reflect side-information that is not directly expressed in the feature vectors. Relative comparisons can express structure at a finer level of detail than the must-link (ML) and cannot-link (CL) constraints commonly used for semi-supervised clustering. Relative comparisons are particularly useful in settings where giving an ML or a CL constraint is difficult because the granularity of the true clustering is unknown. Our main contribution is an efficient algorithm for learning a kernel matrix using the log-determinant divergence (a variant of the Bregman matrix divergence) subject to a set of relative distance constraints. Given the learned kernel matrix, a clustering can be obtained by any suitable algorithm, such as kernel k-means. We show empirically that the kernels found by our algorithm yield clusterings of higher quality than existing approaches that either use ML/CL constraints or implement the supervision from relative comparisons by different means.
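The abstract leaves the final clustering step generic: once a kernel matrix has been learned (there, by minimizing the log-determinant divergence, commonly defined for positive-definite matrices as D_ld(K, K0) = tr(K K0^-1) − log det(K K0^-1) − n, subject to the relative-distance constraints), any kernelized clustering algorithm can consume it. As a minimal illustration of that step — not the paper's kernel-learning algorithm — the following sketch runs Lloyd-style kernel k-means on a given Gram matrix K. The function name and the random-anchor initialization are our own choices for the example.

```python
import numpy as np

def kernel_kmeans(K, k, n_iter=100, seed=0):
    """Lloyd-style kernel k-means on an n x n kernel (Gram) matrix K.

    Squared feature-space distance from item i to the mean of cluster C:
        K[i, i] - (2 / |C|) * sum_{j in C} K[i, j]
                + (1 / |C|^2) * sum_{j, l in C} K[j, l]
    """
    n = K.shape[0]
    rng = np.random.default_rng(seed)
    diag = np.diag(K)
    # Initialize by assigning every item to its nearest of k random anchor
    # items, using the kernel-induced squared distance between single points.
    anchors = rng.choice(n, size=k, replace=False)
    labels = (diag[:, None] - 2.0 * K[:, anchors]
              + diag[anchors][None, :]).argmin(axis=1)
    for _ in range(n_iter):
        dist = np.full((n, k), np.inf)
        for c in range(k):
            members = labels == c
            m = members.sum()
            if m == 0:
                continue  # empty cluster: leave its distances at infinity
            dist[:, c] = (diag
                          - 2.0 * K[:, members].sum(axis=1) / m
                          + K[np.ix_(members, members)].sum() / m ** 2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stable: converged
        labels = new_labels
    return labels
```

With a linear kernel K = X Xᵀ this reduces to ordinary k-means on X; the point of the kernelized form is that it works directly on the learned kernel matrix without needing explicit feature vectors.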
