End-to-end hardware accelerator for deep convolutional neural network

Abstract

Deep convolutional neural networks (CNNs) have achieved state-of-the-art accuracy in recognition, detection, and other computer vision tasks. Dedicated CNN hardware will enable mobile devices to meet real-time demands. However, the design of CNN hardware faces the challenges of high computational complexity and data bandwidth, as well as large differences between CNN layers: the throughput of the convolutional layers is bounded by hardware resources, while the throughput of the fully connected layers is bounded by the available data bandwidth. A highly flexible design with efficient hardware is therefore needed. This talk presents our end-to-end CNN accelerator, which uses a shared filter kernel for all layers and an output view strategy for maximum data reuse. The whole CNN architecture is modelled with a tile-based design to optimize hardware resources and I/O data bandwidth for the target CNN under the design constraints. The final design is generated from the desired resources and is reconfigured at run time with layer-optimized parameters to achieve real-time end-to-end CNN acceleration.
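The contrast the abstract draws between compute-bound convolutional layers and bandwidth-bound fully connected layers can be illustrated with a rough roofline-style estimate. The sketch below is not the accelerator's model from the talk; the layer shapes, the 1-byte value width, the peak throughput, and the memory bandwidth are all hypothetical numbers chosen for illustration, and each layer is assumed to move its weights and activations across the memory interface exactly once.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    macs: int          # multiply-accumulate operations for one input
    weight_bytes: int  # bytes of weights, assumed fetched once
    act_bytes: int     # bytes of input plus output activations


def conv_layer(name, h_out, w_out, c_in, c_out, k, bytes_per_value=1):
    """Convolutional layer; assumes stride 1 and 'same' padding (hypothetical)."""
    macs = h_out * w_out * c_in * c_out * k * k
    weights = c_in * c_out * k * k * bytes_per_value
    acts = h_out * w_out * (c_in + c_out) * bytes_per_value
    return Layer(name, macs, weights, acts)


def fc_layer(name, n_in, n_out, bytes_per_value=1):
    """Fully connected layer: every weight is used for exactly one MAC."""
    macs = n_in * n_out
    weights = n_in * n_out * bytes_per_value
    acts = (n_in + n_out) * bytes_per_value
    return Layer(name, macs, weights, acts)


def intensity(layer):
    """Arithmetic intensity in MACs per byte moved."""
    return layer.macs / (layer.weight_bytes + layer.act_bytes)


def bound(layer, peak_macs_per_s, bandwidth_bytes_per_s):
    """Return which resource limits the layer under the simple model above."""
    compute_time = layer.macs / peak_macs_per_s
    memory_time = (layer.weight_bytes + layer.act_bytes) / bandwidth_bytes_per_s
    return "compute" if compute_time >= memory_time else "bandwidth"


# Hypothetical hardware budget: 100 GMAC/s peak, 4 GB/s external bandwidth.
PEAK = 1e11
BANDWIDTH = 4e9

conv = conv_layer("conv3x3", h_out=56, w_out=56, c_in=64, c_out=64, k=3)
fc = fc_layer("fc", n_in=4096, n_out=4096)

for layer in (conv, fc):
    print(layer.name, round(intensity(layer), 2), bound(layer, PEAK, BANDWIDTH))
```

Under these assumptions the convolutional layer reuses each weight across all output positions, so its intensity is far above that of the fully connected layer, where each weight is used once. This is the kind of per-layer difference that motivates reconfiguring tile parameters layer by layer.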

Bibliographic details

Source
International Symposium on VLSI Design, Automation, and Test | 2018 | pp. 1-1 | 1 page
Author
Tian-Sheuan Chang
Original format
PDF

Similar literature

1. Low power & mobile hardware accelerators for deep convolutional neural networks [J]. Scanlan Anthony G. Integration. 2019, Issue MARa

2. TileNET: Hardware accelerator for ternary Convolutional Neural Networks [J]. Eetha Sagar, Sruthi P. K., Pant Vibha. Microprocessors and Microsystems. 2021, Issue Juna

3. End-to-end hardware accelerator for deep convolutional neural network [C]. Tian-Sheuan Chang. International Symposium on VLSI Design, Automation, and Test. 2018

4. Hardware Acceleration of Deep Convolutional Neural Networks on FPGA [D]. Ma, Yufei. 2018

5. A Multi-Task Framework for Facial Attributes Classification through End-to-End Face Parsing and Deep Convolutional Neural Networks [O]. Khalil Khan, Muhammad Attique, Rehan Ullah Khan. 2020
