Space-address decoupled scratchpad memory management for neural network accelerators


Abstract

Deep neural networks have been demonstrated to be useful in a variety of intelligent tasks, and various specialized NN accelerators have been proposed recently to improve hardware efficiency. These accelerators are typically equipped with software-managed scratchpad memory (SPM) for high performance and energy efficiency. However, traditional SPM management techniques cause memory fragmentation on NN accelerators and thus lead to low utilization of the precious SPM. The main reason is that traditional techniques were originally designed for managing fixed-length registers rather than variable-length memory blocks. In this article, we propose a novel SPM management approach for NN accelerators. The basic intuition is that NN computation/memory behaviors are predictable and relatively regular compared with traditional applications, so most information can be determined at compile time. In addition, by exploiting the variable-length feature of SPM, we propose to divide the allocation process into two passes: the space assignment pass and the address assignment pass, which are performed simultaneously (and implicitly) in traditional one-pass allocation techniques. Experimental results on the memory requests of a representative NN accelerator demonstrate that the proposed approach can reduce memory consumption by up to 30% compared with state-of-the-art SPM management techniques, and its memory usage is only 2% larger than that of the theoretical optimal allocation.
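The abstract only sketches the two-pass idea at a high level. As a rough illustration of what decoupling space assignment from address assignment can mean for compile-time-known buffer lifetimes, the following is a minimal, hypothetical sketch: pass 1 computes the peak total size of simultaneously live buffers (space is reserved without fixing addresses), and pass 2 picks concrete base addresses with a simple first-fit policy that reuses the address ranges of non-overlapping lifetimes. All names and the placement policy here are illustrative assumptions, not the paper's actual algorithm.

```python
from dataclasses import dataclass

@dataclass
class Buffer:
    name: str
    size: int
    start: int   # first time step the buffer is live
    end: int     # last time step the buffer is live (inclusive)

def peak_live_size(buffers):
    """Pass 1 ("space assignment"): lower bound on the SPM capacity
    needed, i.e., the maximum total size of simultaneously live buffers."""
    events = []
    for b in buffers:
        events.append((b.start, b.size))      # buffer becomes live
        events.append((b.end + 1, -b.size))   # buffer is freed
    events.sort()
    cur = peak = 0
    for _, delta in events:
        cur += delta
        peak = max(peak, cur)
    return peak

def assign_addresses(buffers):
    """Pass 2 ("address assignment"): first-fit placement that reuses
    the address range of buffers whose lifetimes do not overlap."""
    placed = []   # list of (offset, Buffer) already assigned
    addrs = {}
    for b in sorted(buffers, key=lambda x: x.start):
        # Address ranges of buffers whose lifetime overlaps with b's.
        conflicts = sorted(
            (off, off + p.size) for off, p in placed
            if not (p.end < b.start or p.start > b.end))
        off = 0
        for lo, hi in conflicts:
            if off + b.size <= lo:   # b fits in the gap before this range
                break
            off = max(off, hi)       # otherwise slide past it
        placed.append((off, b))
        addrs[b.name] = off
    return addrs
```

For example, with buffers A (100 B, live at steps 0-1), B (50 B, steps 1-2), and C (100 B, steps 2-3), pass 1 reports a 150 B peak, and pass 2 places C back at offset 0 because its lifetime never overlaps A's.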

Bibliographic information

  • Source
    Concurrency and Computation: Practice and Experience, 2021, Issue 6, pp. e6046.1-e6046.13 (13 pages)
  • Author affiliations

    Univ Sci & Technol China Sch Comp Sci & Technol Hefei Peoples R China|Nanjing Univ Aeronaut & Astronaut Nanjing Peoples R China|Chinese Acad Sci Inst Comp Technol SKL Comp Architecture Beijing Peoples R China;

    Cambricon Technol Beijing Peoples R China;

    Chinese Acad Sci Inst Comp Technol SKL Comp Architecture Beijing Peoples R China;

    Nanjing Univ Aeronaut & Astronaut Nanjing Peoples R China;

    Nanjing Univ Aeronaut & Astronaut Nanjing Peoples R China;

    Nanjing Univ Aeronaut & Astronaut Nanjing Peoples R China|Univ Chinese Acad Sci Sch Comp Sci & Technol Beijing Peoples R China;

  • Indexing information
  • Original format: PDF
  • Language: English
  • CLC classification
  • Keywords

    deep neural network; memory management; scratchpad memory;


