IEEE Transactions on Neural Networks and Learning Systems

LAGC: Lazily Aggregated Gradient Coding for Straggler-Tolerant and Communication-Efficient Distributed Learning



Abstract

Gradient-based distributed learning in parameter server (PS) computing architectures is subject to random delays due to straggling worker nodes and to possible communication bottlenecks between PS and workers. Solutions have been recently proposed to separately address these impairments based on the ideas of gradient coding (GC), worker grouping, and adaptive worker selection. This article provides a unified analysis of these techniques in terms of wall-clock time, communication, and computation complexity measures. Furthermore, in order to combine the benefits of GC and grouping in terms of robustness to stragglers with the communication and computation load gains of adaptive selection, novel strategies, named lazily aggregated GC (LAGC) and grouped-LAG (G-LAG), are introduced. Analysis and results show that G-LAG provides the best wall-clock time and communication performance while maintaining a low computational cost, for two representative distributions of the computing times of the worker nodes.
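The "lazily aggregated" idea underlying LAG and LAGC can be illustrated with a small sketch: a worker re-transmits its gradient only when it has changed enough since its last transmission, and the parameter server otherwise reuses the stale cached copy, saving communication rounds. The function name, the Euclidean-norm change test, and the fixed threshold below are simplifying assumptions for illustration, not the paper's actual selection rule.

```python
import numpy as np

def lazy_worker_updates(grads_per_round, threshold=0.1):
    """Simulate adaptive (lazy) worker selection at the parameter server.

    grads_per_round: list of rounds, each a list of per-worker gradient arrays.
    Returns (aggregated gradient per round, total worker-to-PS messages).
    """
    num_workers = len(grads_per_round[0])
    cached = [None] * num_workers  # stale gradient copies held at the PS
    messages = 0
    aggregates = []
    for round_grads in grads_per_round:
        for w, g in enumerate(round_grads):
            # Communicate only if there is no cached copy yet, or the
            # gradient has changed by more than the threshold since the
            # last transmission (assumed criterion for this sketch).
            if cached[w] is None or np.linalg.norm(g - cached[w]) > threshold:
                cached[w] = g
                messages += 1
        # PS aggregates a mix of fresh and stale gradients.
        aggregates.append(sum(cached))
    return aggregates, messages
```

When worker gradients change slowly across rounds, the message count stays far below `rounds * num_workers`, which is the communication saving that LAGC and G-LAG combine with the straggler tolerance of gradient coding and grouping.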


