IEEE International Parallel and Distributed Processing Symposium

Semantics-Preserving Parallelization of Stochastic Gradient Descent


Abstract

Stochastic gradient descent (SGD) is a well-known method for regression and classification tasks. However, it is an inherently sequential algorithm: at each step, the processing of the current example depends on the parameters learned from the previous examples. Prior approaches to parallelizing linear learners using SGD, such as Hogwild! and AllReduce, do not honor these dependencies across threads and thus can potentially suffer poor convergence rates and/or poor scalability. This paper proposes SymSGD, a parallel SGD algorithm that, to a first-order approximation, retains the sequential semantics of SGD. Each thread learns a local model in addition to a model combiner, which allows the local models to be combined to produce the same result as what a sequential SGD would have produced. This paper evaluates SymSGD's accuracy and performance on 6 datasets on a shared-memory machine, showing up to an 11x speedup over our heavily optimized sequential baseline on 16 cores and an average 2.2x speedup over Hogwild!.
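The model-combiner idea described in the abstract can be made concrete for the linear case. The sketch below is only an illustration under assumed details (a linear least-squares objective, a fixed learning rate, a single two-block split, and invented names such as sgd_block); it is not the paper's implementation. Each block records, alongside its locally learned model, a combiner matrix that captures how a change in the starting model would propagate through its updates; the second block's result is then adjusted with that matrix so that the combination reproduces what sequential SGD over the whole stream would have produced (exactly here, because the updates are linear; to a first-order approximation in general).

```python
import numpy as np

def sgd_block(w0, X, Y, lr):
    """Run sequential least-squares SGD on one block, starting from w0.

    Besides the locally learned model, accumulate the combiner matrix
    M = prod_k (I - lr * x_k x_k^T), which describes how a change in the
    starting model propagates through this block's updates (exact for a
    linear model; a first-order approximation otherwise).
    """
    w = w0.copy()
    M = np.eye(len(w0))
    for x, y in zip(X, Y):
        step = np.eye(len(w0)) - lr * np.outer(x, x)
        w = step @ w + lr * y * x   # SGD update for the loss 0.5*(y - w.x)^2
        M = step @ M                # fold this update into the combiner
    return w, M

rng = np.random.default_rng(0)
d, n, lr = 5, 200, 0.01
X = rng.normal(size=(n, d))
Y = X @ rng.normal(size=d) + 0.1 * rng.normal(size=n)

# Sequential reference over the whole stream.
w_seq, _ = sgd_block(np.zeros(d), X, Y, lr)

# "Parallel" run: each block works independently; the second block starts
# from an arbitrary local starting point w0b and is fixed up afterwards.
w_a, _ = sgd_block(np.zeros(d), X[:100], Y[:100], lr)
w0b = np.zeros(d)
w_b, M_b = sgd_block(w0b, X[100:], Y[100:], lr)

# Combine: adjust block B's local model for the model block A actually produced.
w_combined = w_b + M_b @ (w_a - w0b)

print(np.allclose(w_combined, w_seq))  # True: combination matches sequential SGD
```

Note that materializing the d x d combiner matrix as done here is only meant to make the semantics explicit for small d; it is not intended as an efficient implementation.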
