Escaping the curse of dimensionality in similarity learning: Efficient Frank-Wolfe algorithm and generalization bounds

Liu Kuan; Bellet Aurelien


Abstract

Similarity and metric learning provides a principled approach to construct a task-specific similarity from weakly supervised data. However, these methods are subject to the curse of dimensionality: as the number of features grows large, poor generalization is to be expected and training becomes intractable due to high computational and memory costs. In this paper, we propose a similarity learning method that can efficiently deal with high-dimensional sparse data. This is achieved through a parameterization of similarity functions by convex combinations of sparse rank-one matrices, together with the use of a greedy approximate Frank-Wolfe algorithm which provides an efficient way to control the number of active features. We show that the convergence rate of the algorithm, as well as its time and memory complexity, are independent of the data dimension. We further provide a theoretical justification of our modeling choices through an analysis of the generalization error, which depends logarithmically on the sparsity of the solution rather than on the number of features. Our experiments on datasets with up to one million features demonstrate the ability of our approach to generalize well despite the high dimensionality as well as its superiority compared to several competing methods. (C) 2018 Elsevier B.V. All rights reserved.
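
The abstract describes the method only at a high level, so the following is a minimal sketch of the general idea rather than the authors' implementation. It runs Frank-Wolfe over the convex hull of sparse rank-one matrices, so that each iteration adds at most two active features. Several details are assumptions not stated in the abstract: the atom form `lam * (e_i ± e_j)(e_i ± e_j)^T`, the bilinear similarity `x^T M x'`, the triplet-based smoothed hinge loss, the step size `2 / (k + 2)`, and every name in the code (`frank_wolfe_similarity`, `smoothed_hinge_grad`, `lam`, `n_iters`). The sketch also uses dense arrays and an exhaustive search over feature pairs for readability, so it does not reproduce the dimension-independent costs claimed in the abstract.

```python
import numpy as np


def smoothed_hinge_grad(z):
    """Derivative of an (assumed) smoothed hinge loss:
    0 for z >= 1, -1 for z <= 0, and z - 1 in between."""
    if z >= 1.0:
        return 0.0
    if z <= 0.0:
        return -1.0
    return z - 1.0


def frank_wolfe_similarity(triplets, d, lam=1.0, n_iters=20):
    """Learn a d x d matrix M as a convex combination of atoms
    lam * u u^T with u = e_i + s * e_j, s in {+1, -1}, i < j.

    triplets: list of (x, x_pos, x_neg) vectors of length d, where x is
    meant to be more similar to x_pos than to x_neg (hypothetical format).
    """
    M = np.zeros((d, d))
    iu, ju = np.triu_indices(d, k=1)  # all feature pairs with i < j

    for k in range(n_iters):
        # Gradient of the average loss with respect to M.
        # The margin is x^T M x_pos - x^T M x_neg = x^T M (x_pos - x_neg).
        G = np.zeros((d, d))
        for x, x_pos, x_neg in triplets:
            diff = x_pos - x_neg
            G += smoothed_hinge_grad(x @ M @ diff) * np.outer(x, diff)
        G /= len(triplets)

        # Linear minimization step: pick the atom minimizing <G, u u^T>,
        # which equals G_ii + G_jj + s * (G_ij + G_ji).
        diag = np.diag(G)
        base = diag[iu] + diag[ju]
        cross = G[iu, ju] + G[ju, iu]
        score_plus, score_minus = base + cross, base - cross
        p, m = np.argmin(score_plus), np.argmin(score_minus)
        if score_plus[p] <= score_minus[m]:
            i, j, s = iu[p], ju[p], 1.0
        else:
            i, j, s = iu[m], ju[m], -1.0

        u = np.zeros(d)
        u[i], u[j] = 1.0, s
        atom = lam * np.outer(u, u)  # rank one, four nonzero entries

        # Convex update; gamma = 1 at k = 0, so M starts as a single atom.
        gamma = 2.0 / (k + 2.0)
        M = (1.0 - gamma) * M + gamma * atom

    return M
```

Because every iterate is a convex combination of such atoms, `M` stays symmetric and positive semidefinite with trace `2 * lam`, and after `n_iters` iterations it involves at most `2 * n_iters` features. Under these assumptions, stopping early is therefore a direct way to limit the number of active features.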


Bibliographic record

Source
Neurocomputing | 2019, Issue 14 | pp. 185-199 | 15 pages
Authors
Liu Kuan; Bellet Aurelien
Affiliations

Google Inc, Mountain View, CA USA;

INRIA, Villers Les Nancy, France;

Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
Original format: PDF
Language: English
Keywords
Metric learning; Frank-Wolfe algorithm; Generalization bounds

