Conference: IEEE International Symposium on Circuits and Systems

Accelerating 3D Convolutional Neural Networks Using 3D Fast Fourier Transform

Abstract

Three-dimensional convolutional neural networks (3D CNNs) have attracted great attention in many complex computer vision tasks. However, 3D CNNs are difficult to deploy in practical applications because of their high algorithmic complexity, which creates an urgent need for dedicated accelerators. In this paper, we propose F3D, a fast algorithm for 3D CNNs based on the 3D Fast Fourier Transform (FFT) that achieves a significant algorithmic strength reduction. We then propose an F3D-based hardware architecture featuring a flexible FFT module and an efficient partial-sum aggregation module. Furthermore, we design a dataflow for efficient mapping of 3D CNNs that significantly reduces memory access. To demonstrate the efficiency of these techniques, we implement the widely used 3D CNN model C3D as our benchmark on the Xilinx VC709 platform. The experimental results show that, compared with the state-of-the-art accelerator, our work achieves a throughput of up to 864.1 GOPs, along with 1.68× and 2.00× improvements in energy efficiency and DSP efficiency, respectively.
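The abstract does not spell out the F3D algorithm, so the following is only a minimal sketch of the general idea it builds on: a 3D convolution layer can be computed as an element-wise product in the frequency domain. It assumes a single input channel, a single filter, stride 1, and no padding. The function names `direct_conv3d` and `fft_conv3d` are hypothetical and are not taken from the paper.

```python
import numpy as np


def direct_conv3d(x, w):
    """Reference: valid-mode 3D cross-correlation, as used in CNN layers."""
    D, H, W = x.shape
    kd, kh, kw = w.shape
    out = np.zeros((D - kd + 1, H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            for k in range(out.shape[2]):
                # Slide the kernel over the volume and accumulate products.
                out[i, j, k] = np.sum(x[i:i + kd, j:j + kh, k:k + kw] * w)
    return out


def fft_conv3d(x, w):
    """Same result computed with 3D FFTs (illustrative, not the paper's F3D)."""
    D, H, W = x.shape
    kd, kh, kw = w.shape
    axes = (0, 1, 2)
    # Cross-correlation equals convolution with the kernel flipped on all axes.
    w_flipped = w[::-1, ::-1, ::-1]
    # Zero-pad to the full linear-convolution size to avoid circular wrap-around.
    shape = (D + kd - 1, H + kh - 1, W + kw - 1)
    X = np.fft.rfftn(x, s=shape, axes=axes)
    Wf = np.fft.rfftn(w_flipped, s=shape, axes=axes)
    # Element-wise product in the frequency domain replaces the sliding window.
    full = np.fft.irfftn(X * Wf, s=shape, axes=axes)
    # Keep only the region where the kernel fully overlaps the input.
    return full[kd - 1:D, kh - 1:H, kw - 1:W]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((6, 7, 8))   # toy input volume
    w = rng.standard_normal((3, 3, 3))   # toy 3x3x3 kernel
    print(np.allclose(direct_conv3d(x, w), fft_conv3d(x, w)))
```

In a real layer the transformed input would be reused across many filters and the per-channel products summed before the inverse transform, which is where a partial-sum aggregation step would naturally fit. How F3D tiles the input or sizes its FFTs is not stated in the abstract and is not modeled here.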
