TileNET: Scalable Architecture for High-Throughput Ternary Convolution Neural Networks Using FPGAs



Abstract

Convolution Neural Networks (CNNs) are becoming increasingly popular in advanced driver assistance systems (ADAS) and automated driving (AD) for camera perception, enabling applications such as object detection, lane detection, and semantic segmentation. The ever-increasing need for multiple high-resolution cameras around the car necessitates a very high throughput, on the order of a few tens of TeraMACs per second (TMACS), along with high detection accuracy. Existing implementations do not scale, with performance only on the order of a few giga-operations per second. This paper proposes a novel tiled architecture for CNNs that uses only ternarized weights, while input and output features are kept at full precision, resulting in minimal loss of accuracy. The proposed solution is implemented on a Virtex-7 FPGA, resulting in a throughput of 13.76 TOPS. In post-implementation power simulation, AlexNet consumes 16 W, orders of magnitude lower than existing GPUs.
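
To illustrate why ternarized weights suit hardware, the following is a minimal software sketch, not the paper's tiled FPGA architecture. It assumes a simple fixed magnitude threshold for ternarization (the abstract does not say how weights are ternarized), and the names `ternarize`, `ternary_conv2d`, and `threshold` are hypothetical. It shows that with weights restricted to -1, 0, and +1, each convolution tap reduces to an add, a subtract, or a skip on full-precision features.

```python
def ternarize(weights, threshold):
    """Map each real weight to -1, 0, or +1.

    Assumption: a fixed magnitude threshold; the source does not
    specify the ternarization rule.
    """
    return [[1 if w > threshold else -1 if w < -threshold else 0 for w in row]
            for row in weights]


def ternary_conv2d(feature, kernel):
    """Valid 2D cross-correlation of a full-precision feature map
    with a ternary kernel, using only additions and subtractions."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(feature) - kh + 1
    out_w = len(feature[0]) - kw + 1
    out = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    t = kernel[i][j]
                    if t == 1:        # weight +1: add the feature value
                        acc += feature[y + i][x + j]
                    elif t == -1:     # weight -1: subtract the feature value
                        acc -= feature[y + i][x + j]
                    # weight 0: skip the tap entirely
            out[y][x] = acc
    return out


# Illustrative values only; not taken from the paper.
weights = [[0.9, -0.05], [-0.6, 0.02]]
kernel = ternarize(weights, threshold=0.1)
feature = [[1.0, 2.0, 3.0],
           [4.0, 5.0, 6.0],
           [7.0, 8.0, 9.0]]
result = ternary_conv2d(feature, kernel)
```

Because no multiplier is needed per tap, a hardware design along these lines could plausibly spend its logic on adders instead, which is one reason ternary networks are attractive for high-throughput FPGA implementations.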


Bibliographic details

Source
International Conference on VLSI Design; International Conference on Embedded Systems, 2018, pp. 461-462 (2 pages)
Authors
Sahu Sai Vikram; Vibha Panty; Mihir Mody; Madhura Purnaprajna;
Original format: PDF
Keywords
Convolution; Acceleration; Field programmable gate arrays; Throughput; Computer architecture; Neural networks; Electronic mail;


