ACM Transactions on Reconfigurable Technology and Systems

FINN-R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks



Abstract

Convolutional Neural Networks have rapidly become the most successful machine-learning algorithm, enabling ubiquitous machine vision and intelligent decisions even on embedded computing systems. While the underlying arithmetic is structurally simple, the compute and memory requirements are challenging. One promising opportunity is leveraging reduced-precision representations for inputs, activations, and model parameters. The resulting scalability in performance, power efficiency, and storage footprint offers attractive design trade-offs in exchange for a small reduction in accuracy. FPGAs are ideal for low-precision inference engines that leverage custom precisions to achieve the numerical accuracy required by a given application. In this article, we describe the second generation of the FINN framework, an end-to-end tool that enables design-space exploration and automates the creation of fully customized inference engines on FPGAs. Given a neural network description, the tool optimizes for a given platform, design target, and precision. We introduce formalizations of resource cost functions and performance predictions and elaborate on the optimization algorithms. Finally, we evaluate a selection of reduced-precision neural networks, ranging from CIFAR-10 classifiers to YOLO-based object detection, on a range of platforms including PYNQ and AWS F1, demonstrating unprecedented measured throughput of 50 TOp/s on AWS F1 and 5 TOp/s on embedded devices.
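The sketch below illustrates the kind of folding-based performance prediction the abstract refers to, following the per-layer parallelism parameters (PE and SIMD) described in the FINN papers. The class and function names and the simplified cost model are illustrative assumptions for this sketch, not the framework's actual API.

```python
# A minimal sketch of a folding-based throughput prediction, assuming the
# PE/SIMD folding scheme from the FINN papers. Names and the cost model
# are illustrative, not the FINN-R API.

from dataclasses import dataclass

@dataclass
class LayerFolding:
    """Folding of one matrix-vector layer onto a compute unit."""
    matrix_h: int   # rows of the weight matrix (output channels)
    matrix_w: int   # columns of the weight matrix (inputs per output)
    pe: int         # processing elements working on rows in parallel
    simd: int       # SIMD lanes working on columns in parallel

    def cycles_per_input(self) -> int:
        # One pass over the weight matrix takes (H/PE) * (W/SIMD) cycles
        # when PE divides H and SIMD divides W.
        assert self.matrix_h % self.pe == 0 and self.matrix_w % self.simd == 0
        return (self.matrix_h // self.pe) * (self.matrix_w // self.simd)

def predicted_fps(layers: list[LayerFolding], clock_hz: float) -> float:
    # In a dataflow pipeline, throughput is set by the slowest
    # (least-folded) layer.
    bottleneck = max(layer.cycles_per_input() for layer in layers)
    return clock_hz / bottleneck

# Example: two layers folded so neither dominates, at a 200 MHz clock.
layers = [
    LayerFolding(matrix_h=128, matrix_w=1152, pe=32, simd=36),  # 128 cycles
    LayerFolding(matrix_h=256, matrix_w=1152, pe=64, simd=18),  # 256 cycles
]
print(f"Predicted throughput: {predicted_fps(layers, 200e6):.0f} inputs/s")
```

In a dataflow architecture of this kind, each layer is its own pipeline stage, so balancing the folding factors across layers, as in the example, is the sort of decision the design-space exploration automates against resource budgets.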

