IEEE International Solid-State Circuits Conference (ISSCC)

14.2 DNPU: An 8.1TOPS/W reconfigurable CNN-RNN processor for general-purpose deep neural networks



Abstract

Recently, deep learning with convolutional neural networks (CNNs) and recurrent neural networks (RNNs) has become ubiquitous across a wide range of applications. CNNs are used to support vision recognition and processing, while RNNs can recognize time-varying entities and support generative models. Combining CNNs and RNNs makes it possible to recognize time-varying visual entities, such as actions and gestures, and to support image captioning [1]. However, the computational requirements of CNNs are quite different from those of RNNs. Fig. 14.2.1 shows a computation and weight-size analysis of convolution layers (CLs), fully-connected layers (FCLs), and RNN-LSTM layers (RLs). While CLs require a massive amount of computation with a relatively small number of filter weights, FCLs and RLs require a relatively small amount of computation with a huge number of weights. Therefore, when FCLs and RLs are accelerated on SoCs specialized for CLs, they suffer from high memory-transaction costs, low PE utilization, and a mismatch of computational patterns. Conversely, when CLs are accelerated on FCL- and RL-dedicated SoCs, they cannot exploit weight reuse or achieve the required throughput. So far, prior works have considered acceleration of either CLs [2-4] or FCLs and RLs [5]; there has been no work on a combined CNN-RNN processor. A highly reconfigurable CNN-RNN processor with high energy efficiency is therefore desirable to support general-purpose deep neural networks (DNNs).
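The compute-versus-weight asymmetry described above can be made concrete with a back-of-envelope count of multiply-accumulate operations (MACs) and weights. The layer shapes below are hypothetical examples chosen for illustration, not figures from the paper: a convolution layer reuses its small filter across every output position, so MACs vastly outnumber weights, whereas a fully-connected layer touches each weight exactly once per input.

```python
# Illustrative MAC/weight comparison for a convolution layer (CL)
# vs. a fully-connected layer (FCL). Layer shapes are hypothetical.

def conv_layer_stats(h, w, c_in, c_out, k):
    """MACs and weight count for a k x k convolution producing an h x w x c_out output."""
    weights = k * k * c_in * c_out
    macs = h * w * weights  # the same filter weights are reused at every output position
    return macs, weights

def fc_layer_stats(n_in, n_out):
    """MACs and weight count for a fully-connected layer."""
    weights = n_in * n_out
    macs = weights          # each weight is used exactly once per input vector
    return macs, weights

cl_macs, cl_w = conv_layer_stats(h=56, w=56, c_in=64, c_out=64, k=3)
fc_macs, fc_w = fc_layer_stats(n_in=4096, n_out=4096)

print(f"CL : {cl_macs/1e6:6.1f} M MACs, {cl_w/1e6:6.2f} M weights")
print(f"FCL: {fc_macs/1e6:6.1f} M MACs, {fc_w/1e6:6.2f} M weights")
```

With these example shapes, the CL performs roughly 115.6 M MACs using only ~37 K weights, while the FCL performs ~16.8 M MACs but must fetch ~16.8 M weights, which is why a CL-optimized SoC with heavy on-chip reuse fares poorly on FCLs/RLs, and vice versa.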
