IEEE International Conference on Consumer Electronics - Taiwan

High Throughput Hardware Implementation for Deep Learning AI Accelerator



Abstract

In this paper, a high-throughput hardware accelerator for deep learning neural networks is proposed. Since deep learning workloads generate heavy data traffic to DRAM, we design a high-data-reuse architecture that reduces direct accesses to external DRAM, together with a pipeline scheme that meets high-throughput requirements. The proposed architecture uses INT8 arithmetic, a 128-bit AXI bus protocol, and parallel processing with 16 processing units, and achieves real-time operation at a 125 MHz clock frequency with a throughput of 8 GOPS.
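The headline numbers in the abstract can be sanity-checked with back-of-the-envelope arithmetic. As a minimal sketch (the per-unit MAC count and op-counting convention below are assumptions, not stated in the paper): if each of the 16 processing units performs 2 INT8 multiply-accumulates per cycle, and each MAC counts as 2 operations, the stated 8 GOPS at 125 MHz follows, and the 128-bit AXI bus at the same clock would supply 2 GB/s of DRAM bandwidth.

```python
# Back-of-the-envelope check of the stated figures.
# Assumed (not from the paper): 2 MACs per processing unit per cycle,
# with each MAC counted as 2 operations (multiply + accumulate).
freq_hz = 125e6        # 125 MHz clock
units = 16             # parallel processing units
macs_per_unit = 2      # assumed
ops_per_mac = 2        # multiply + accumulate

gops = freq_hz * units * macs_per_unit * ops_per_mac / 1e9
print(gops)            # 8.0 GOPS, matching the abstract

# Peak AXI bandwidth: 128-bit (16-byte) bus, one beat per cycle (assumed).
axi_gbps = 16 * freq_hz / 1e9
print(axi_gbps)        # 2.0 GB/s
```

This also illustrates why the high-data-reuse design matters: at INT8 precision, 8 GOPS of compute could consume operands far faster than 2 GB/s of external bandwidth can deliver them unless data is reused on-chip.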
