National Conference with International Participation

Using of Bfloat16 Format in Deep Learning Embedded Accelerators based on FPGA with Limited Quantity of Dedicated Multipliers



Abstract

The hardware platforms for Deep Learning Neural Network (DLNN) realization are remote cloud services, Graphics Processing Units (GPUs), and Field Programmable Gate Arrays (FPGAs). One of the main differences between FPGA devices that matters for DLNN realization is the quantity of dedicated multipliers in their DSP blocks. This article describes an optimization method based on the bfloat16 data format that is useful for FPGA devices with a small quantity of DSP blocks.
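The abstract does not spell out why bfloat16 suits multiplier-poor FPGAs, but it follows from a well-known property of the format: bfloat16 keeps the sign bit and 8-bit exponent of IEEE-754 float32 and truncates the mantissa to 7 stored bits, so a mantissa product (8 effective bits including the hidden one) fits a narrow 8x8 multiplier. The following is a minimal, self-contained C sketch of the float32/bfloat16 conversions, not the authors' implementation; the round-to-nearest-even choice is an assumption.

```c
#include <stdint.h>
#include <string.h>
#include <stdio.h>

/* bfloat16 reuses the sign and 8-bit exponent of IEEE-754 float32
 * and keeps only the top 7 mantissa bits, so a value is simply the
 * upper half of the 32-bit word. */
typedef uint16_t bf16;

/* Convert float32 -> bfloat16 with round-to-nearest-even
 * (rounding mode is an assumption; plain truncation also works). */
static bf16 f32_to_bf16(float f) {
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);
    /* Add half of the discarded range, biased by the LSB of the
     * kept part, to implement round-to-nearest-even. */
    uint32_t rounding = 0x7FFFu + ((bits >> 16) & 1u);
    return (bf16)((bits + rounding) >> 16);
}

/* Convert bfloat16 -> float32: place the 16 bits in the upper half. */
static float bf16_to_f32(bf16 h) {
    uint32_t bits = (uint32_t)h << 16;
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}

int main(void) {
    /* A bfloat16 significand is 8 bits (7 stored + hidden one), so the
     * mantissa product of a multiply fits an 8x8 -> 16-bit multiplier,
     * which maps onto small FPGA DSP blocks or even LUT logic. */
    float a = 3.14159f, b = 2.71828f;
    float p = bf16_to_f32(f32_to_bf16(a)) * bf16_to_f32(f32_to_bf16(b));
    printf("bf16 product ~= %f (exact %f)\n", p, a * b);
    return 0;
}
```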
