Conference: International Conference on Biomedical Engineering and Applications

Configurable N-fold Hardware Architecture for Convolutional Neural Networks



Abstract

A Convolutional Neural Network (CNN) is a class of deep feed-forward artificial neural network, usually employed to analyze visual images. Rapid recent progress in CNN-based applications has spurred research on efficient architectures and implementations that exploit the latest technological advances. The growing complexity of CNN architectures, together with the use of reconfigurable devices, enlarges the design space to the point where it is hard to explore fully. This paper proposes a new method to design and implement efficient and flexible CNN architectures in hardware. The method adopts an N-fold approach that is particularly suitable for reconfigurable devices with strict power-consumption constraints. An 8-layer CNN to classify handwriting was trained in software and prototyped, using the proposed CNN architecture, on the FPGA of a Zynq 7000 Programmable System on a Chip (SoC) board. The hardware architecture was described and implemented using High Level Synthesis, which enables fast development and easy configuration. Experimental results show that the best performance is achieved by a pipelined design with a partially unfolded sublayer: a classification time of 3.4 ms, using 41.16% of the resources and consuming 2.1 W. In contrast, a low-power design (<2 W), using 27.99% of the resources of the board, required 16.7 ms to process the same CNN.
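The trade-off between the two reported designs can be made concrete with a little arithmetic. The sketch below uses only the figures quoted in the abstract. It assumes that the quoted power is drawn constantly for the whole classification, and it treats the "<2 W" figure as an upper bound of 2.0 W, so the energy value for the low-power design is a ceiling rather than a measurement. The variable names are illustrative and do not come from the paper.

```python
# Figures quoted in the abstract for the fastest (pipelined) design.
fast_time_ms = 3.4
fast_power_w = 2.1
fast_resources_pct = 41.16

# Figures quoted for the low-power design. The abstract only says "<2 W",
# so 2.0 W is used here as an upper bound (assumption).
low_time_ms = 16.7
low_power_max_w = 2.0
low_resources_pct = 27.99

# How much faster the pipelined design classifies one input.
speedup = low_time_ms / fast_time_ms

# Energy per classification, assuming constant power draw: W * ms = mJ.
fast_energy_mj = fast_power_w * fast_time_ms
low_energy_max_mj = low_power_max_w * low_time_ms

# Relative resource cost of the faster design.
resource_ratio = fast_resources_pct / low_resources_pct

# Average power at which the low-power design would match the fast
# design's energy per classification.
break_even_power_w = fast_energy_mj / low_time_ms

print(f"speedup: {speedup:.2f}x")
print(f"energy per classification (fast): {fast_energy_mj:.2f} mJ")
print(f"energy per classification (low-power, upper bound): {low_energy_max_mj:.2f} mJ")
print(f"resource ratio (fast / low-power): {resource_ratio:.2f}x")
print(f"break-even power for low-power design: {break_even_power_w:.2f} W")
```

Under these assumptions the pipelined design is roughly five times faster for about one and a half times the resources. Because only an upper bound on the low-power design's consumption is given, the energy comparison cannot be settled from the abstract alone.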
