
Divide and Conquer Algorithms for Machine Learning


Abstract

This thesis improves the scalability of machine learning by studying mergeable learning algorithms. In a mergeable algorithm, many processors independently solve the learning problem on small subsets of the data; a master processor then merges the solutions together with only a single round of communication. Mergeable algorithms are popular because they are fast, easy to implement, and have strong privacy guarantees.

Our first contribution is a novel fast cross-validation procedure suitable for any mergeable algorithm. This procedure has a constant runtime independent of the number of folds and can be implemented on distributed systems. It is also widely applicable: we show that 32 recently proposed learning algorithms are mergeable and therefore fit our cross-validation framework. These learning algorithms come from many subfields of machine learning, including density estimation, regularized loss minimization, dimensionality reduction, submodular optimization, variational inference, and Markov chain Monte Carlo.

We also provide two new mergeable learning algorithms. In the context of regularized loss minimization, existing merge procedures have either high bias or slow runtimes. We introduce the optimal weighted average (OWA) merge procedure, which achieves both low bias and a fast runtime. We also improve the cover tree data structure for fast nearest neighbor queries by providing a merge procedure. In doing so, we improve both the theoretical guarantees of the cover tree and its practical runtime. For example, the original cover tree finds nearest neighbors in time O(c_exp^12 log n), and we improve this bound to O(c_hole^4 log n) for i.i.d. data. Here, c_exp and c_hole are measures of the "intrinsic dimensionality" of the data, and on typical datasets c_exp > c_hole. Experiments on large-scale ad-click, genomics, and image classification tasks empirically validate these algorithms.
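The mergeable framework and the fold-independent cross-validation trick described in the abstract can be illustrated with a minimal sketch. This example uses a naive averaging merge for linear least squares — the thesis's OWA merge is more sophisticated — and all function names here are hypothetical, not from the thesis. The key point is that leave-one-partition-out cross-validation costs only a subtraction per fold, so the total runtime does not grow with the number of folds:

```python
import numpy as np

def fit_local(X, y):
    # Local learner on one data partition: ordinary least squares.
    return np.linalg.lstsq(X, y, rcond=None)[0]

def merge(models):
    # Naive averaging merge: a single round of communication
    # suffices to combine the per-partition solutions.
    return np.mean(models, axis=0)

def fast_cv(models, X_parts, y_parts):
    # Leave-one-partition-out cross-validation.  The model that
    # excludes fold i is recovered from the precomputed sum in O(1),
    # so total cost is independent of the number of folds beyond
    # the k local fits we already paid for.
    k = len(models)
    total = np.sum(models, axis=0)
    errs = []
    for i in range(k):
        w_minus_i = (total - models[i]) / (k - 1)  # merge of all folds except i
        pred = X_parts[i] @ w_minus_i
        errs.append(np.mean((pred - y_parts[i]) ** 2))
    return float(np.mean(errs))
```

Each partition is fit exactly once; cross-validation then reuses those local models instead of retraining, which is the source of the constant (fold-count-independent) runtime claimed in the abstract.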

Bibliographic details

  • Author

    Izbicki, Michael John.

  • Affiliation

    University of California, Riverside.

  • Degree grantor: University of California, Riverside.
  • Subject: Information science; Artificial intelligence.
  • Degree: Ph.D.
  • Year: 2017
  • Pages: 170 p.
  • Format: PDF
  • Language: English
