On Tree-Based Methods for Similarity Learning

Abstract

In many situations, the choice of an adequate similarity measure or metric on the feature space largely determines the performance of machine learning methods. Building such measures automatically is the specific purpose of metric/similarity learning. In [21], similarity learning is formulated as a pairwise bipartite ranking problem: ideally, the larger the probability that two observations in the feature space belong to the same class (or share the same label), the higher the similarity measure between them. From this perspective, the ROC curve is an appropriate performance criterion, and the goal of this article is to extend recursive tree-based ROC optimization techniques in order to propose efficient similarity learning algorithms. The validity of such iterative partitioning procedures in the pairwise setting is established by means of results from the theory of U-processes. From a practical angle, it is discussed at length how to implement them by means of splitting rules specifically tailored to the similarity learning task. Beyond these theoretical and methodological contributions, numerical experiments are presented and provide strong empirical evidence of the performance of the proposed algorithmic approaches.
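To make the pairwise bipartite ranking view concrete, the following is a minimal sketch, not the authors' method. It labels each pair of observations as positive when the two share a label and negative otherwise, then computes the empirical area under the ROC curve of a candidate similarity function over those pairs. The names `pairwise_auc` and `neg_sq_euclidean`, and the toy data, are assumptions introduced here for illustration only.

```python
from itertools import combinations


def pairwise_auc(X, y, similarity):
    """Empirical AUC of `similarity`, treating pairs as a bipartite ranking problem.

    A pair (i, j) is "positive" when y[i] == y[j] and "negative" otherwise.
    The AUC is the fraction of (positive, negative) pair combinations in which
    the positive pair gets the higher similarity score; ties count as 1/2.
    """
    pos, neg = [], []
    for i, j in combinations(range(len(X)), 2):
        score = similarity(X[i], X[j])
        (pos if y[i] == y[j] else neg).append(score)
    if not pos or not neg:
        raise ValueError("need at least one same-label and one different-label pair")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def neg_sq_euclidean(a, b):
    """Hypothetical baseline similarity: negative squared Euclidean distance."""
    return -sum((u - v) ** 2 for u, v in zip(a, b))


# Toy data (made up for illustration): two well-separated classes.
X = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 0.9)]
y = [0, 0, 1, 1]

auc = pairwise_auc(X, y, neg_sq_euclidean)
constant_auc = pairwise_auc(X, y, lambda a, b: 0.0)
```

On this toy set, every same-label pair is closer than every different-label pair, so the baseline similarity ranks all positive pairs above all negative ones, while a constant similarity carries no ranking information. The brute-force double loop is quadratic in the number of pairs and is meant only to show the criterion, not an efficient way to optimize it.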
