ARA: Cross-Layer approximate computing framework based reconfigurable architecture for CNNs

Gong Yu; Liu Bo; Ge Wei; Shi Longxing

Abstract

Convolutional Neural Networks (CNNs) are now widely used in image processing, object detection, video detection, and other classification tasks. Because of their computational complexity and data dependence, CNN acceleration is also widely researched. To achieve high energy efficiency, we propose a CNN accelerator based on approximate computing techniques. This paper studies two main aspects: hardware-compatible network compression algorithms, and approximate computing units and architectures with hardware resource scheduling strategies. For the algorithm approximation part, we introduce a dynamic layered CNN structure for different scales of input, a convolution kernel shrinking strategy with layer-by-layer quantization to compress networks, and the Winograd minimal filtering algorithm to reduce the number of operations in convolution layers. For the architecture part, two types of approximate multiplier are designed: iterative multipliers and multi-port SRAM integrated LUT-based multipliers. Approximate adders with error correction logic are also designed. Based on these approximate computing units, a Convolution Neural Processing Unit (CNPU) is proposed with reconfigurable datapath designs for mapping different tasks. Combining the algorithm work, the CNPU architecture, and the datapath design, we propose a highly energy-efficient reconfigurable CNN accelerator with approximate computing named ARA (Approximate computing based Reconfigurable Architecture). Implemented in a TSMC 45 nm process, our accelerator achieves energy efficiencies of 1.92 TOPS/W at 1.1 V, 200 MHz and 3.72 TOPS/W at 0.9 V, 40 MHz, which is 1.51 to 4.36 times better than state-of-the-art accelerators.
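The abstract names the Winograd minimal filtering algorithm as the way operations in convolution layers are reduced, but does not state which tile size is used. As an illustration only, the sketch below assumes the common 1-D case F(2,3): two outputs of a 3-tap filter computed with 4 multiplications instead of 6. The function names `direct_fir` and `winograd_f23` are hypothetical and are not taken from the paper.

```python
def direct_fir(d, g):
    """Reference computation: two outputs of a 3-tap filter (6 multiplications)."""
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Winograd F(2,3): the same two outputs with 4 multiplications.

    d: four consecutive input samples, g: three filter taps.
    F(2,3) is an assumed tile size chosen for illustration.
    """
    # Filter transform: depends only on the kernel, so it can be precomputed.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]

    # Input transform followed by the four element-wise multiplications.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3

    # Output transform: additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


print(winograd_f23([1, 2, 3, 4], [1, 2, 3]))  # prints [14.0, 20.0]
```

The multiplication count drops from 6 to 4 per pair of outputs at the cost of extra additions, which is why the technique suits accelerators where multipliers dominate energy. How ARA maps this onto its approximate multipliers is not described in the abstract.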

Bibliographic details

Source: Microelectronics journal, 2019, Issue 5, pp. 33-44 (12 pages)
Authors: Gong Yu; Liu Bo; Ge Wei; Shi Longxing
Affiliation (listed identically for all four authors): Southeast Univ, Natl ASIC Syst Engn Technol Res Ctr, Nanjing 210096, Jiangsu, Peoples R China
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords: Convolution Neural Networks; Approximate computing; Reconfigurable computing

Similar literature

1. ARA: Cross-Layer approximate computing framework based reconfigurable architecture for CNNs [J]. Gong Yu, Liu Bo, Ge Wei. Microelectronics journal, 2019, May issue.
2. Toward Approximate Computing for Coarse-Grained Reconfigurable Architectures [J]. Omid Akbari, Mehdi Kamal, Ali Afzali-Kusha. IEEE Micro, 2018, Issue 6.
3. E-ERA: An energy-efficient reconfigurable architecture for RNNs using dynamically adaptive approximate computing [J]. Bo Liu, Wei Dong, Tingting Xu. IEICE Electronics Express, 2017, Issue 15.
4. Invited: Cross-layer approximate computing: From logic to architectures [C]. Muhammad Shafique, Rehan Hafiz, Semeen Rehman. ACM/EDAC/IEEE Design Automation Conference, 2016.
5. Approximate computing: An integrated cross-layer framework [D]. Venkataramani, Swagath. 2016.
6. SatEC: A 5G Satellite Edge Computing Framework Based on Microservice Architecture [O]. Lei Yan, Suzhi Cao, Yongsheng Gong. 2019.
7. E-ERA: An energy-efficient reconfigurable architecture for RNNs using dynamically adaptive approximate computing [O]. Bo Liu, Wei Dong, Tingting Xu. 2017.
8. Computing Science: Studying the Interplay of Concurrency, Performance, Energy and Reliability with ArchOn - an Architecture-Open Resource-Driven Cross-Layer Modelling Framework [R]. Rafiev, A., Iliasov, A., Romanovsky, A. 2014.
