首页> 外文会议>IEEE International Symposium on Information Theory >Straggler-Proofing Massive-Scale Distributed Matrix Multiplication with D-Dimensional Product Codes
【24h】

Straggler-Proofing Massive-Scale Distributed Matrix Multiplication with D-Dimensional Product Codes

机译:具有D维乘积码的防散乱的大规模分布式矩阵乘法

获取原文

摘要

Distributed computing allows for large-scale computation and machine learning tasks by enabling parallel computing at massive scale. A critical challenge to speeding up distributed computing comes from stragglers, a crippling bottleneck to system performance [1]. Recently, coding theory has offered an attractive paradigm dubbed as coded computation [2] for addressing this challenge through the judicious introduction of redundant computing to combat stragglers. However, most existing approaches have limited applicability if the system scales to hundreds or thousands of workers, as is the trend in computing platforms. At these scales, previously proposed algorithms based on Maximum Distance Separable (MDS) codes are too expensive due to their hidden cost, i.e., computing and communication costs associated with the encoding/decoding procedures. Motivated by this limitation, we present a novel coded matrix-matrix multiplication scheme based on d-dimensional product codes. We show that our scheme allows for order-optimal computation/communication costs for the encoding/decoding procedures while achieving near-optimal compute time.
机译:分布式计算通过实现大规模并行计算,可以进行大规模计算和机器学习任务。加速分布式计算的一个关键挑战来自散乱的人,这是系统性能的一个严重瓶颈[1]。最近,编码理论提供了一种有吸引力的范式,称为编码计算[2],它通过明智地引入冗余计算来对抗散乱者来解决这一挑战。但是,如果系统可以扩展到成百上千的工作人员,那么大多数现有方法的适用性就会受到限制,这是计算平台的趋势。在这些规模上,先前提出的基于最大距离可分离(MDS)码的算法由于其隐藏成本(即,与编码/解码过程相关的计算和通信成本)而过于昂贵。受此限制的驱使,我们提出了一种基于d维乘积码的新颖编码矩阵矩阵乘法方案。我们表明,我们的方案允许在实现接近最佳计算时间的同时,对编码/解码过程进行阶次最佳计算/通信成本。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号