Conference paper: Design, Automation and Test in Europe Conference and Exhibition (DATE)

Towards Design Space Exploration and Optimization of Fast Algorithms for Convolutional Neural Networks (CNNs) on FPGAs



Abstract

Convolutional Neural Networks (CNNs) have gained widespread popularity in the field of computer vision and image processing. Due to the huge computational requirements of CNNs, dedicated hardware-based implementations are being explored to improve their performance. Hardware platforms such as Field Programmable Gate Arrays (FPGAs) are widely used to design parallel architectures for this purpose. In this paper, we analyze Winograd minimal filtering (fast convolution) algorithms to reduce the arithmetic complexity of the convolutional layers of CNNs. We explore a complex design space to find the sets of parameters that result in improved throughput and power-efficiency. We also design a pipelined and parallel Winograd convolution engine that improves the throughput and power-efficiency while reducing the computational complexity of the overall system. Our proposed designs show up to 4.75× and 1.44× improvements in throughput and power-efficiency, respectively, in comparison to the state-of-the-art design, while using approximately 2.67× more multipliers. Furthermore, we obtain savings of up to 53.6% in logic resources compared with the state-of-the-art implementation.
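To illustrate how a Winograd minimal filtering algorithm trades multiplications for additions, the sketch below implements the classic one-dimensional F(2,3) case: two outputs of a 3-tap filter are computed with 4 multiplications instead of the 6 required by direct convolution. This is a minimal illustrative example of the general technique the abstract refers to, not the paper's 2-D FPGA engine (which nests such transforms over image tiles); all function names here are my own.

```python
def winograd_f2_3(d, g):
    """Winograd minimal filtering F(2,3): compute 2 outputs of a
    3-tap FIR filter over a 4-sample input tile with only 4
    multiplications (direct convolution would need 6)."""
    d0, d1, d2, d3 = d
    g0, g1, g2 = g
    # Filter transform -- can be precomputed once per filter,
    # so its arithmetic is amortized across all input tiles.
    G0 = g0
    G1 = (g0 + g1 + g2) / 2.0
    G2 = (g0 - g1 + g2) / 2.0
    G3 = g2
    # The 4 element-wise multiplications (the expensive operations
    # that map to DSP multipliers on an FPGA).
    m1 = (d0 - d2) * G0
    m2 = (d1 + d2) * G1
    m3 = (d2 - d1) * G2
    m4 = (d1 - d3) * G3
    # Inverse (output) transform: only additions/subtractions.
    return [m1 + m2 + m3, m2 - m3 - m4]

def direct_conv_3tap(d, g):
    """Reference direct 3-tap convolution (6 multiplications)."""
    return [sum(d[i + j] * g[j] for j in range(3)) for i in range(2)]
```

For example, `winograd_f2_3([1, 2, 3, 4], [1, 1, 1])` returns `[6.0, 9.0]`, matching the direct convolution. The 2-D variant F(2×2, 3×3) obtained by nesting this scheme reduces the multiplications per 2×2 output tile from 36 to 16, which is the source of the arithmetic-complexity savings explored in the paper's design space.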
