IEEE International Symposium on Circuits and Systems

Hardware accelerators for recurrent neural networks on FPGA


Abstract

Recurrent Neural Networks (RNNs) have the ability to retain memory and learn from data sequences, which is fundamental for real-time applications. RNN computations offer limited data reuse, which leads to high data traffic. This translates into a high off-chip memory bandwidth or a large internal storage requirement to achieve high performance. Exploiting parallelism in RNN computations is bounded by these two limiting factors, among other constraints present in embedded systems. Therefore, a balance between internally stored data and off-chip memory data transfer is necessary to overlap computation time with data transfer latency. In this paper, we present three hardware accelerators for RNNs on Xilinx's Zynq SoC FPGA to show how to overcome the challenges involved in developing RNN accelerators. Each design uses different strategies to achieve high performance and scalability. Each co-processor was tested with a character-level language model. The latest design, called DeepRnn, achieves up to 23X better performance per watt than the Tegra X1 development board for this application.
