
Learning with Equivalence Constraints and the Relation to Multiclass Learning


Abstract

We study the problem of learning partitions using equivalence constraints as input. This is a binary classification problem in the product space of pairs of datapoints. The training data includes pairs of datapoints which are labeled as coming from the same class or not. This kind of data appears naturally in applications where explicit labeling of datapoints is hard to get, but relations between datapoints can be more easily obtained, using, for example, Markovian dependency (as in video clips). Our problem is an unlabeled partition problem, and is therefore tightly related to multiclass classification. We show that the solutions of the two problems are related, in the sense that a good solution to the binary classification problem entails the existence of a good solution to the multiclass problem, and vice versa. We also show that bounds on the sample complexity of the two problems are similar, by showing that their relevant 'dimensions' (VC dimension for the binary problem, Natarajan dimension for the multiclass problem) bound each other. Finally, we show the feasibility of solving multiclass learning efficiently by using a solution of the equivalent binary classification problem. In this way advanced techniques developed for binary classification, such as SVM and boosting, can be used directly to enhance multiclass learning.
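To make the relation between the two problems concrete, the following minimal sketch shows how class labels induce equivalence constraints on pairs of datapoints, and how a set of "same class" pairs determines an unlabeled partition. The function names `labels_to_constraints` and `partition_from_pairs` are hypothetical, and merging positive pairs by transitive closure (union-find) is an illustrative assumption, not the procedure analyzed in the paper.

```python
from itertools import combinations


def labels_to_constraints(labels):
    """Return (i, j, same) for every unordered pair of datapoint indices.

    `same` is True when both points carry the same class label. This is the
    binary labeling on the product space of pairs described in the abstract.
    """
    return [
        (i, j, labels[i] == labels[j])
        for i, j in combinations(range(len(labels)), 2)
    ]


def partition_from_pairs(n, same_pairs):
    """Group n points into blocks by merging every pair marked equivalent.

    Assumption for illustration: positive pairs are closed transitively with
    a union-find structure. The result is a partition without class names.
    """
    parent = list(range(n))

    def find(x):
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in same_pairs:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_j] = root_i

    blocks = {}
    for i in range(n):
        blocks.setdefault(find(i), []).append(i)
    return sorted(blocks.values())


labels = ["a", "b", "a", "c", "b"]
constraints = labels_to_constraints(labels)
positive = [(i, j) for i, j, same in constraints if same]
blocks = partition_from_pairs(len(labels), positive)
```

Here `blocks` groups the indices by class but carries no class names, which is why the recovered object is an unlabeled partition: it matches a multiclass labeling only up to a renaming of the classes.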
