IEEE International Parallel and Distributed Processing Symposium

Semantics-Preserving Parallelization of Stochastic Gradient Descent



Abstract

Stochastic gradient descent (SGD) is a well-known method for regression and classification tasks. However, it is an inherently sequential algorithm: at each step, the processing of the current example depends on the parameters learned from previous examples. Prior approaches to parallelizing linear learners using SGD, such as Hogwild! and AllReduce, do not honor these dependencies across threads and thus can potentially suffer poor convergence rates and/or poor scalability. This paper proposes SymSGD, a parallel SGD algorithm that, to a first-order approximation, retains the sequential semantics of SGD. Each thread learns a local model in addition to a model combiner, which allows local models to be combined to produce the same result that a sequential SGD would have produced. An evaluation of SymSGD's accuracy and performance on 6 datasets on a shared-memory machine shows up to an 11x speedup over our heavily optimized sequential baseline on 16 cores and, on average, a 2.2x speedup over Hogwild!.
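To make the model-combiner idea concrete, below is a minimal NumPy sketch for a squared-loss linear learner, where each SGD step is affine in the current model. Alongside its local model, a thread maintains a matrix M such that rerunning the same examples from a shifted starting point w_snap + delta yields local_model + M @ delta (exact for this loss; a first-order approximation in general). The function and variable names are illustrative only, and the sketch keeps the full d x d combiner for clarity, whereas the paper keeps the combiner tractable by maintaining a randomly projected low-dimensional version.

```python
import numpy as np

def local_sgd_with_combiner(w_start, X, y, lr):
    """Run sequential SGD on one thread's slice of examples, starting from a
    snapshot w_start of the shared model. Besides the local model, maintain a
    combiner matrix M so that the same SGD run started from w_start + delta
    would produce local_model + M @ delta. For the squared-loss update used
    here this relation is exact, because each step is affine in the model."""
    d = w_start.size
    w = w_start.copy()
    M = np.eye(d)  # combiner; the paper projects it to a few random dimensions
    for x_i, y_i in zip(X, y):
        # Squared-loss SGD step: w <- w - lr * (x.w - y) * x
        #                           = (I - lr * x x^T) w + lr * y * x
        w = w - lr * (x_i @ w - y_i) * x_i
        M = (np.eye(d) - lr * np.outer(x_i, x_i)) @ M
    return w, M

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, n = 5, 200
    X = rng.normal(size=(n, d))
    y = X @ rng.normal(size=d)
    lr = 0.05

    w_snap = np.zeros(d)                      # snapshot the thread started from
    w_local, M = local_sgd_with_combiner(w_snap, X, y, lr)

    # Meanwhile the shared model advanced to w_global; combine without
    # re-processing this thread's examples:
    w_global = w_snap + rng.normal(scale=0.1, size=d)
    combined = w_local + M @ (w_global - w_snap)

    # Compare against actually rerunning SGD from w_global:
    w_check, _ = local_sgd_with_combiner(w_global, X, y, lr)
    print(np.max(np.abs(combined - w_check)))  # near machine precision
```

The check at the end illustrates the "semantics-preserving" claim in this restricted setting: the combined model matches what a sequential run over the same examples from the advanced global model would have produced.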

