IEEE International Symposium on Circuits and Systems

Hardware accelerators for recurrent neural networks on FPGA


Abstract

Recurrent Neural Networks (RNNs) have the ability to retain memory and learn from data sequences, which is fundamental for real-time applications. RNN computations offer limited data reuse, which leads to high data traffic. This translates into a high off-chip memory bandwidth or a large internal storage requirement to achieve high performance. Exploiting parallelism in RNN computations is bounded by these two limiting factors, among other constraints present in embedded systems. Therefore, a balance between internally stored data and off-chip memory data transfer is necessary to overlap computation time with data transfer latency. In this paper, we present three hardware accelerators for RNNs on Xilinx's Zynq SoC FPGA to show how to overcome the challenges involved in developing RNN accelerators. Each design uses different strategies to achieve high performance and scalability. Each co-processor was tested with a character-level language model. The latest design, called DeepRnn, achieves up to 23X better performance per watt than the Tegra X1 development board for this application.
