IEEE International Conference on Consumer Electronics

Automated Hardware and Neural Network Architecture co-design of FPGA accelerators using multi-objective Neural Architecture Search


Abstract

State-of-the-art Neural Network Architectures (NNAs) are challenging to design and to implement efficiently in hardware. In the past couple of years, this has led to an explosion in the research and development of automatic Neural Architecture Search (NAS) tools. AutoML tools are now used to achieve state-of-the-art NNA designs and attempt to optimize for hardware usage and design. Much of the recent research on the automated design of NNAs has focused on convolutional networks and image recognition, ignoring the fact that a significant part of the workload in data centers consists of general-purpose deep neural networks. In this work, we develop and test a general multilayer perceptron (MLP) flow that can take arbitrary datasets as input and automatically produce optimized NNAs and hardware designs. We test the flow on six benchmarks. Our results show that we exceed the accuracy of currently published MLP results and are competitive with non-MLP-based results. We compare common general-purpose GPU architectures with our scalable FPGA design and show that we achieve higher efficiency and higher throughput (outputs per second) for the majority of datasets. Further insights into the design space for both accurate networks and high-performing hardware show the power of co-design, correlating accuracy versus throughput, network size versus accuracy, and scaling to high-performance devices.
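The multi-objective search the abstract describes can be illustrated with a minimal sketch: random search over MLP hidden-layer widths, where each candidate is scored on an accuracy proxy and on a parameter-count hardware-cost proxy, and non-dominated candidates are kept as the Pareto front. Everything below is an illustrative assumption, not the paper's actual tool: the `estimate_accuracy` placeholder stands in for training and evaluating the network, and parameter count stands in for FPGA resource cost.

```python
import random

def mlp_params(hidden, n_in=64, n_out=10):
    """Parameter count (weights + biases) of an MLP with the given hidden widths."""
    dims = [n_in, *hidden, n_out]
    return sum(a * b + b for a, b in zip(dims, dims[1:]))

def pareto_front(candidates):
    """Candidates are (architecture, accuracy, cost) tuples.
    Keep candidates not dominated by any other: a dominator has
    accuracy >= and cost <= with at least one strict inequality."""
    front = []
    for arch, acc, cost in candidates:
        dominated = any(
            (a2 >= acc and c2 <= cost) and (a2 > acc or c2 < cost)
            for _, a2, c2 in candidates
        )
        if not dominated:
            front.append((arch, acc, cost))
    return front

# Random-search sketch: sample hidden-layer widths and score each candidate.
# In the real flow this score would come from training/evaluating the MLP.
def estimate_accuracy(hidden):
    return 1.0 - 1.0 / (1 + sum(hidden))  # placeholder proxy only

random.seed(0)
cands = []
for _ in range(20):
    hidden = [random.choice([16, 32, 64, 128]) for _ in range(random.randint(1, 3))]
    cands.append((tuple(hidden), estimate_accuracy(hidden), mlp_params(hidden)))
front = pareto_front(cands)  # accuracy-vs-cost trade-off curve
```

In an actual co-design flow the cost axis would be a hardware model of the FPGA design (throughput, resource usage) rather than a raw parameter count, and the search strategy could be evolutionary or Bayesian rather than random, but the Pareto-front selection is the same.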