IEEE Conference on Computer Vision and Pattern Recognition

Closed-Form Training of Mahalanobis Distance for Supervised Clustering

Abstract

Clustering is the task of grouping a set of objects so that objects in the same cluster are more similar to each other than to those in other clusters. The crucial step in most clustering algorithms is to find an appropriate similarity metric, which is both challenging and problem-dependent. Supervised clustering approaches, which can exploit labeled clustered training data that share a common metric with the test set, have thus been proposed. Unfortunately, current metric learning approaches for supervised clustering do not scale to large or even medium-sized datasets. In this paper, we propose a new structured Mahalanobis Distance Metric Learning method for supervised clustering. We formulate our problem as an instance of large margin structured prediction and prove that it can be solved very efficiently in closed-form. The complexity of our method is (in most cases) linear in the size of the training dataset. We further reveal a striking similarity between our approach and multivariate linear regression. Experiments on both synthetic and real datasets confirm several orders of magnitude speedup while still achieving state-of-the-art performance.
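The abstract does not give the formulation itself, so what follows is only a minimal sketch of the two ideas it names. The function `mahalanobis` computes the standard distance sqrt((x - y)^T M (x - y)) for a positive semidefinite matrix M. The function `fit_metric_ridge` is a hypothetical stand-in, not the paper's structured large-margin solution: it fits a ridge regression from centered features to centered one-hot cluster indicators in closed form and sets M = W W^T, chosen only to echo the regression connection the abstract mentions. The names `fit_metric_ridge` and `lam` are assumptions introduced for illustration.

```python
import numpy as np


def mahalanobis(x, y, M):
    """Mahalanobis distance sqrt((x - y)^T M (x - y)) for a PSD matrix M."""
    d = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    # Clamp at zero to guard against tiny negative values from round-off.
    return float(np.sqrt(max(d @ M @ d, 0.0)))


def fit_metric_ridge(X, labels, lam=1e-3):
    """Illustrative closed-form metric fit (an assumption, NOT the paper's method).

    Solves the ridge regression W = (Xc^T Xc + lam * I)^{-1} Xc^T Yc, where Xc
    holds centered features and Yc holds centered one-hot cluster indicators,
    then returns M = W W^T, which is symmetric positive semidefinite.
    """
    X = np.asarray(X, dtype=float)
    classes, idx = np.unique(labels, return_inverse=True)
    Y = np.eye(len(classes))[idx]          # one-hot cluster indicators, n x k
    Xc = X - X.mean(axis=0)                # centered features, n x d
    Yc = Y - Y.mean(axis=0)                # centered indicators, n x k
    d = X.shape[1]
    # One linear solve; the cost grows linearly with the number of samples n
    # through the products Xc^T Xc and Xc^T Yc.
    W = np.linalg.solve(Xc.T @ Xc + lam * np.eye(d), Xc.T @ Yc)
    return W @ W.T
```

With M set to the identity matrix, `mahalanobis` reduces to the Euclidean distance. A learned M would then be plugged into an ordinary clustering algorithm on the test set, which is the supervised clustering setting the abstract describes.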
