FeatherNet: An Accelerated Convolutional Neural Network Design for Resource-constrained FPGAs

Morcel Raghid; Hajj Hazem M.; Saghir Mazen A. R.; Akkary Haitham; Artail Hassan; Khanna Rahul; Keshavamurthy Anil


Abstract

Convolutional Neural Network (ConvNet or CNN) algorithms are characterized by a large number of model parameters and high computational complexity. These two characteristics make implementation on resource-limited FPGAs challenging, and the challenges are magnified for low-end FPGAs. While previous work has demonstrated successful ConvNet implementations on high-end FPGAs, this article presents a ConvNet accelerator design that enables the implementation of complex deep ConvNet architectures on resource-constrained FPGA platforms aimed at the IoT market. We call the design "FeatherNet" for its light resource utilization. The implementations are VHDL-based, providing flexibility in design optimizations. As part of the design process, new methods are introduced to address several design challenges. The first method is a novel stride-aware graph-based method targeted at ConvNets that aims at efficient signal processing with reduced resource utilization. The second method addresses the challenge of determining the minimal arithmetic precision needed while preserving high accuracy. For this challenge, we propose variable-width dynamic fixed-point representations combined with a layer-by-layer design-space pruning heuristic across the layers of the deep ConvNet model. The third method aims at a modular design that supports different types of ConvNet layers while ensuring low resource utilization. For this challenge, we propose relatively small modules composed of computational filters that can be interconnected to build an entire accelerator design. These elements can be easily configured through HDL parameters (e.g., layer type, mask size, stride) to meet the needs of specific ConvNet implementations, and thus they can be reused to implement a wide variety of ConvNet architectures.
The fourth method addresses the challenge of design portability between two FPGA vendor platforms, namely Intel/Altera and Xilinx. For this challenge, we propose to instantiate the device-specific hardware blocks needed in each computational filter, rather than relying on the synthesis tools to infer these blocks, while keeping track of the similarities and differences between the two platforms. We believe that the solutions to these design challenges advance knowledge, as they can benefit designers and other researchers using similar devices or facing similar challenges. Our results demonstrate that the design challenges were addressed successfully, achieving low (30%) resource utilization on the low-end FPGA platforms Zedboard and Cyclone V. The design overcomes the limitation of designs that target high-end platforms and cannot fit on low-end IoT platforms. Furthermore, our design shows superior performance (measured in [Frame/s/W] per Dollar) compared to high-end optimized designs.
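
The idea behind the second method (dynamic fixed-point with a width chosen per layer) can be illustrated with a small sketch. This is not the authors' VHDL implementation, and it is not their pruning heuristic: the function names (`choose_frac_bits`, `quantize`, `pick_widths`), the sample values, and the use of maximum quantization error as a stand-in for model accuracy are all assumptions made for illustration. Each layer gets its own fractional length, derived from the largest magnitude in that layer, and the search keeps the narrowest candidate width whose error stays within a tolerance.

```python
import math


def choose_frac_bits(values, width):
    """Fractional length such that the largest magnitude fits in `width` signed bits."""
    peak = max(abs(v) for v in values)
    if peak == 0:
        return width - 1
    int_bits = math.floor(math.log2(peak)) + 1  # bits left of the binary point
    return width - 1 - int_bits


def quantize(values, width):
    """Round to a signed fixed-point grid of `width` bits, saturating at the limits."""
    scale = 2.0 ** choose_frac_bits(values, width)
    lo, hi = -(1 << (width - 1)), (1 << (width - 1)) - 1
    return [max(lo, min(hi, round(v * scale))) / scale for v in values]


def max_abs_error(values, width):
    """Largest absolute difference between the values and their quantized form."""
    return max(abs(a - b) for a, b in zip(values, quantize(values, width)))


def pick_widths(layers, candidate_widths, tolerance):
    """Per layer, keep the narrowest width within tolerance (hypothetical proxy search)."""
    chosen = {}
    for name, values in layers.items():
        for width in sorted(candidate_widths):
            if max_abs_error(values, width) <= tolerance:
                chosen[name] = width
                break
        else:
            # No candidate met the tolerance; fall back to the widest one.
            chosen[name] = max(candidate_widths)
    return chosen


# Made-up sample weights for two hypothetical layers.
layers = {
    "conv1": [0.5, -0.25, 0.125],
    "fc": [3.1416, -2.7183, 0.0001],
}
widths = pick_widths(layers, candidate_widths=[4, 8, 16], tolerance=0.01)
```

In this sketch a layer whose values sit on a coarse binary grid settles on a narrow width, while a layer with a wider range and finer detail needs more bits. A real flow would presumably evaluate classification accuracy on a validation set rather than raw quantization error.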


Bibliographic information

Source
ACM Transactions on Reconfigurable Technology and Systems | 2019, Issue 2 | pp. 6.1-6.27 | 27 pages
Authors
Morcel Raghid; Hajj Hazem M.; Saghir Mazen A. R.; Akkary Haitham; Artail Hassan; Khanna Rahul; Keshavamurthy Anil
Author affiliations

Amer Univ Beirut POB 11-0236 Beirut 11072020 Lebanon;

Intel Corp Hillsboro OR USA;

Original format: PDF
Language: English
Keywords
Convolutional neural networks; embedded-vision; IoT applications; resource-constrained FPGAs;


