Horizontal division of deep learning applications with all-to-all communication on a multi-FPGA system

Abstract

Although convolutional neural networks (CNNs) offer abundant parallelism, traditional layer-by-layer task division on multi-FPGA systems has two problems: (1) The computational load differs from layer to layer, so execution time is dominated by the heaviest layer. (2) Each FPGA must be designed independently, which means designing, generating, and managing multiple configuration files. To address these problems, we propose a horizontal division method that allows a single design to be used for every FPGA. All layers of the target CNN are divided in the horizontal direction, and a set of layers is implemented on each FPGA. This reduces design time as well as the management cost of execution. In addition, because the weight data can be separated, local memory usage is reduced. The apparent disadvantage of this method is that it requires all-to-all data communication between FPGA boards, so it is not suitable for traditional multi-FPGA systems with a simple linear network. Here, we apply the method to FiC (Flow-in-Cloud), which has a powerful network that enables efficient broadcasting. A simple CNN, LeNet, and a matrix multiplication representing a more practical fully connected layer are implemented on the FiC prototype. In the evaluation, LeNet on 8 FPGAs ran 7.5 times faster than on a single FPGA and 12.6 times faster than optimized software on a high-end CPU.
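
The division described in the abstract can be illustrated with a small simulation. This is a minimal sketch, not the authors' implementation. It assumes that "horizontal division" means each worker (standing in for one FPGA) holds a slice of the output rows of every layer's weight matrix, computes its slice of the outputs, and then shares that slice with all other workers before the next layer begins. The function names (split_rows, forward_horizontal, forward_single), the toy weights, and the use of ReLU are hypothetical choices made for illustration only.

```python
# Illustrative sketch only: plain-Python simulation of horizontally divided
# fully connected layers. Workers stand in for FPGAs; the row-wise split and
# the ReLU activation are assumptions, not details taken from the paper.

def matvec(rows, x):
    """Multiply a list of weight rows by the input vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in rows]

def relu(values):
    return [max(0, v) for v in values]

def split_rows(weights, num_workers):
    """Give each worker a contiguous slice of output rows (assumed scheme)."""
    n = len(weights)
    bounds = [n * i // num_workers for i in range(num_workers + 1)]
    return [weights[bounds[i]:bounds[i + 1]] for i in range(num_workers)]

def forward_single(layers, x):
    """Reference: one device evaluates every layer in full."""
    for weights in layers:
        x = relu(matvec(weights, x))
    return x

def forward_horizontal(layers, x, num_workers):
    """Every worker runs the same code on its own slice of each layer."""
    for weights in layers:
        shards = split_rows(weights, num_workers)  # weights are separated
        partials = [relu(matvec(shard, x)) for shard in shards]  # local work
        # All-to-all exchange: each worker shares its slice, so every worker
        # holds the full activation vector before the next layer starts.
        x = [value for part in partials for value in part]
    return x

# Toy integer weights so the comparison below is exact.
W1 = [[(i * j) % 5 - 2 for j in range(4)] for i in range(8)]  # 8 outputs, 4 inputs
W2 = [[(i + j) % 3 - 1 for j in range(8)] for i in range(4)]  # 4 outputs, 8 inputs
layers = [W1, W2]
x0 = [1, 2, 3, 4]

assert forward_horizontal(layers, x0, 4) == forward_single(layers, x0)
```

In this sketch every worker executes identical code and differs only in which rows it holds, which mirrors the single-design property claimed in the abstract. The concatenation step is where the all-to-all traffic would occur on real hardware, which is why the method depends on a network that broadcasts efficiently.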

Bibliographic record

Source
International Symposium on Computing and Networking Workshops | 2020 | pp. 277-281 | 5 pages
Authors
Yugo Yamauchi; Akram Ben Ahmed; Kazuei Hironaka; Kensuke Iizuka; Hideharu Amano
Original format
PDF
Keywords
Deep learning; Prototypes; Parallel processing; Software; Data communication; Task analysis; Field programmable gate arrays
