IEEE International Symposium on Circuits and Systems

Accelerating Deep Neural Network Computation on a Low Power Reconfigurable Architecture

Abstract

Recent work on neural network architectures has focused on bridging the gap between performance/efficiency and programmability. We consider implementations of three popular neural networks, ResNet, AlexNet, and the ASGD weight-dropped recurrent neural network (AWD RNN), on a low-power programmable architecture, Transformer. The architecture consists of lightweight cores interconnected by caches and crossbars that support run-time reconfiguration between shared and private cache modes. We present efficient implementations of key neural network kernels and evaluate the performance of each kernel under the different cache modes. The best-performing cache mode per kernel is then used in the implementation of the end-to-end network. Simulation results show superior performance, with ResNet, AlexNet, and AWD RNN achieving 188.19 GOPS/W, 150.53 GOPS/W, and 120.68 GOPS/W, respectively, at the 14 nm technology node.
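The selection flow the abstract describes, profiling each kernel in both cache modes and composing the end-to-end network from the per-kernel winners, can be sketched in a few lines. The Python below is a minimal illustration under assumed inputs: the kernel names, the GOPS/W figures, and the profiling table are hypothetical stand-ins, not data or interfaces from the paper.

```python
# Hypothetical sketch: pick the best cache mode per kernel, then compose the
# end-to-end network from those choices. Kernel names, the two cache modes,
# and the throughput numbers below are illustrative assumptions, not
# measurements from the paper.

from typing import Dict, List, Tuple

CACHE_MODES = ("shared", "private")

# Stand-in for per-kernel profiling results (GOPS/W) that a real flow would
# obtain by simulating each kernel on the architecture in each cache mode.
ASSUMED_PROFILE: Dict[str, Dict[str, float]] = {
    "conv3x3":   {"shared": 140.0, "private": 165.0},
    "fc":        {"shared": 120.0, "private": 105.0},
    "lstm_cell": {"shared":  95.0, "private": 110.0},
}

def best_cache_mode(kernel: str) -> str:
    """Return the cache mode with the highest profiled efficiency."""
    modes = ASSUMED_PROFILE[kernel]
    return max(modes, key=modes.get)

def plan_network(kernels: List[str]) -> List[Tuple[str, str]]:
    """Assign every kernel in an end-to-end network its best cache mode."""
    return [(k, best_cache_mode(k)) for k in kernels]

if __name__ == "__main__":
    # A ResNet-like layer sequence, purely for illustration.
    for kernel, mode in plan_network(["conv3x3", "conv3x3", "fc"]):
        print(f"{kernel}: run in {mode} cache mode")
```

In an actual evaluation the profile table would be filled by running each kernel on the simulated architecture in each cache mode, as the paper's per-kernel measurements do, rather than from fixed constants.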
