Journal: Wireless Communications & Mobile Computing

An FPGA-Based Convolutional Neural Network Coprocessor

Abstract

In this paper, an FPGA-based convolutional neural network coprocessor is proposed. The coprocessor has a 1D convolution processing element (PE) that operates in row-stationary (RS) streaming mode and a 3D convolution unit built as a PE chain in a systolic array structure. The coprocessor can flexibly control how many PE arrays are enabled according to the number of output channels of the convolutional layer. We design a storage system with a multilevel cache, in which the global cache uses multiple broadcasts to distribute data to local caches, and we propose an image segmentation method that is compatible with the hardware architecture. The proposed coprocessor implements the convolutional and pooling layers of the VGG16 neural network model, with activation, weight, and bias values quantized to 16-bit fixed point. At a clock frequency of 200 MHz, it achieves a peak computational performance of 316.0 GOP/s and an average computational performance of 62.54 GOP/s, with a power consumption of about 9.25 W.
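
To make the 16-bit fixed-point arithmetic of a 1D convolution PE concrete, the following is a minimal software sketch, not the coprocessor's actual design. The abstract states only that activations, weights, and biases use 16-bit fixed-point quantization; the Q8.8 split (`FRAC_BITS = 8`), the saturation behavior, and the function names `quantize`, `dequantize`, and `conv1d_row_fixed` are assumptions introduced here for illustration.

```python
# Hypothetical sketch: 16-bit fixed-point 1D convolution of one row.
# Assumption: Q8.8 format (8 fractional bits); the abstract does not give the split.
FRAC_BITS = 8
SCALE = 1 << FRAC_BITS
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def saturate(q: int) -> int:
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, q))


def quantize(x: float) -> int:
    """Round to the nearest fixed-point step, then saturate to 16 bits."""
    return saturate(round(x * SCALE))


def dequantize(q: int) -> float:
    """Convert a fixed-point integer back to a float."""
    return q / SCALE


def conv1d_row_fixed(row, weights, bias):
    """Valid 1D convolution of one quantized input row with one filter row.

    Each product carries 2 * FRAC_BITS fractional bits, so the sum is kept
    in a wide accumulator and shifted back to FRAC_BITS once at the end.
    """
    k = len(weights)
    out = []
    for i in range(len(row) - k + 1):
        acc = bias << FRAC_BITS  # align the bias with the product scale
        for j in range(k):
            acc += row[i + j] * weights[j]  # multiply-accumulate
        out.append(saturate(acc >> FRAC_BITS))
    return out


row = [quantize(v) for v in (0.5, -1.25, 2.0, 0.75, -0.5)]
weights = [quantize(v) for v in (0.25, 0.5, -0.25)]
bias = quantize(0.125)
out = conv1d_row_fixed(row, weights, bias)
print([dequantize(q) for q in out])  # → [-0.875, 0.625, 1.125]
```

The example values are exactly representable in Q8.8, so the fixed-point result matches the floating-point convolution; other inputs would show a small quantization error. A wide accumulator is used because summing 16-bit products in 16 bits would overflow.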