Straggler-Proofing Massive-Scale Distributed Matrix Multiplication with D-Dimensional Product Codes

Abstract

Distributed computing enables large-scale computation and machine learning tasks through parallel computing at massive scale. A critical challenge to speeding up distributed computing comes from stragglers, a crippling bottleneck to system performance [1]. Recently, coding theory has offered an attractive paradigm, dubbed coded computation [2], for addressing this challenge through the judicious introduction of redundant computation to combat stragglers. However, most existing approaches have limited applicability when the system scales to hundreds or thousands of workers, as is the trend in computing platforms. At these scales, previously proposed algorithms based on Maximum Distance Separable (MDS) codes are too expensive due to their hidden costs, i.e., the computation and communication costs associated with the encoding/decoding procedures. Motivated by this limitation, we present a novel coded matrix-matrix multiplication scheme based on d-dimensional product codes. We show that our scheme allows for order-optimal computation/communication costs for the encoding/decoding procedures while achieving near-optimal compute time.
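The idea of product-coded matrix multiplication can be illustrated with a small sketch. The example below is not the construction from the paper: it assumes the simplest possible case, d = 2 with a single-parity-check code in each dimension, and it assumes an iterative "peeling" decoder that repeatedly repairs any row or column of the worker grid with exactly one missing result. All function names (matmul, with_parity, peel) and the toy matrices are hypothetical and chosen only for illustration.

```python
# Toy 2-D product-coded matrix multiplication (illustrative assumption:
# single-parity-check component codes, pure-Python integer matrices).

def matmul(X, Y):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def add(X, Y):
    return [[a + b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def sub(X, Y):
    return [[a - b for a, b in zip(r, s)] for r, s in zip(X, Y)]

def with_parity(blocks):
    """Append one parity block equal to the sum of all data blocks."""
    parity = blocks[0]
    for blk in blocks[1:]:
        parity = add(parity, blk)
    return blocks + [parity]

def peel(grid):
    """Fill missing (None) worker results in place; return True if all recovered.

    In every row and column of the grid, the last cell equals the sum of the
    other cells, so a line with exactly one missing cell can be repaired.
    """
    rows, cols = len(grid), len(grid[0])
    lines = [[(i, j) for j in range(cols)] for i in range(rows)]
    lines += [[(i, j) for i in range(rows)] for j in range(cols)]
    progress = True
    while progress:
        progress = False
        for line in lines:
            missing = [(i, j) for i, j in line if grid[i][j] is None]
            if len(missing) != 1:
                continue
            known = [grid[i][j] for i, j in line if grid[i][j] is not None]
            if missing[0] == line[-1]:
                # The parity cell is missing: it is the sum of the data cells.
                value = known[0]
                for blk in known[1:]:
                    value = add(value, blk)
            else:
                # A data cell is missing: parity minus the other data cells.
                value = known[-1]
                for blk in known[:-1]:
                    value = sub(value, blk)
            i, j = missing[0]
            grid[i][j] = value
            progress = True
    return all(cell is not None for row in grid for cell in row)

A = [[1, 2], [3, 4], [5, 6], [7, 8]]   # 4 x 2, split into two row blocks
B = [[1, 0, 2, 1], [0, 1, 1, 3]]       # 2 x 4, split into two column blocks

A_blocks = with_parity([A[0:2], A[2:4]])
B_blocks = with_parity([[r[0:2] for r in B], [r[2:4] for r in B]])

# A 3 x 3 grid of workers; worker (i, j) computes A_blocks[i] * B_blocks[j].
grid = [[matmul(a, b) for b in B_blocks] for a in A_blocks]

# Simulate three stragglers whose results never arrive.
for i, j in [(0, 0), (0, 1), (1, 1)]:
    grid[i][j] = None

recovered = peel(grid)

# Reassemble C = A * B from the four data cells of the grid.
C = [grid[i][0][r] + grid[i][1][r] for i in range(2) for r in range(2)]
expected = matmul(A, B)
```

In this toy setup both encoding and decoding use only block additions and subtractions, which hints at why product codes can keep encoding/decoding overhead low compared with a single long MDS code. The straggler pattern shown is decodable because each peeling step exposes a new line with a single missing cell; patterns in which every affected row and column has two or more missing cells would not be recoverable with this simple code.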

Bibliographic details

Source
IEEE International Symposium on Information Theory | 2018 | pp. 1993-1997 | 5 pages
Authors
Tavor Baharav; Kangwook Lee; Orhan Ocal; Kannan Ramchandran
Original format: PDF
Keywords
Product codes; Encoding; Decoding; Task analysis; Distributed computing; Two dimensional displays; Toy manufacturing industry

Similar literature

1. Coded Computing for Resilient, Secure, and Privacy-Preserving Distributed Matrix Multiplication [J]. Yu Qian, Avestimehr A. Salman. IEEE Transactions on Communications. 2021, No. 1
2. Coded Computing and Cooperative Transmission for Wireless Distributed Matrix Multiplication [J]. Li Kuikui, Tao Meixia, Zhang Jingjing. IEEE Transactions on Communications. 2021, No. 4
3. GASP Codes for Secure Distributed Matrix Multiplication [J]. D'Oliveira Rafael G. L., El Rouayheb Salim, Karpuk David. IEEE Transactions on Information Theory. 2020, No. 7
4. Straggler-Proofing Massive-Scale Distributed Matrix Multiplication with D-Dimensional Product Codes [C]. Tavor Baharav, Kangwook Lee, Orhan Ocal. IEEE International Symposium on Information Theory. 2018
5. Flexible Cross-Subspace Alignment Codes for Variable Coded Distributed Batch Matrix Multiplication [D]. Tauz, Lev. 2020
6. Massive-Scale Binding Free Energy Simulations of HIV Integrase Complexes Using Asynchronous Replica Exchange Framework Implemented on the IBM WCG Distributed Network [O]. Junchao Xia, William Flynn, Emilio Gallicchio.
7. Coded Computing and Cooperative Transmission for Wireless Distributed Matrix Multiplication [O]. Kuikui Li, Meixia Tao, Jingjing Zhang. 2021
