Euromicro Conference on Digital System Design

Buffer Sizes Reduction for Memory-efficient CNN Inference on Mobile and Embedded Devices



Abstract

Nowadays, convolutional neural networks (CNNs) are at the core of many intelligent systems, including those that run on mobile and embedded devices. However, executing computationally demanding and memory-hungry CNNs on resource-limited mobile and embedded devices is quite challenging. One of the main problems when running CNNs on such devices is the limited amount of memory available. Thus, reducing the CNN memory footprint is crucial for CNN inference on mobile and embedded devices. The CNN memory footprint is determined by the amount of memory required to store the CNN parameters (weights and biases) and the intermediate data exchanged between CNN operators. The most common approaches to reducing the CNN memory footprint, such as pruning and quantization, reduce the memory required to store the CNN parameters. However, these approaches decrease the CNN accuracy. Moreover, with the increasing depth of state-of-the-art CNNs, the intermediate data exchanged between CNN operators takes even more space than the CNN parameters. Therefore, in this paper, we propose a novel approach that reduces the memory required to store the intermediate data exchanged between CNN operators. Unlike pruning and quantization, our proposed approach preserves the CNN accuracy and reduces the CNN memory footprint at the cost of decreasing the CNN throughput. Thus, our approach is orthogonal to pruning and quantization, and can be combined with these approaches for further CNN memory footprint reduction.
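To make the footprint trade-off concrete, here is a minimal sketch (not the paper's actual algorithm, and with made-up layer shapes) of how intermediate-data memory can be estimated for a purely sequential CNN, and how sharing buffers between operators whose tensors are never alive at the same time shrinks that footprint:

```python
# Illustrative sketch only: estimates the memory needed for intermediate
# data in a hypothetical sequential CNN, comparing one-buffer-per-tensor
# allocation against ping-pong buffer sharing.

def tensor_bytes(shape, dtype_bytes=4):
    """Bytes needed to store one intermediate tensor (float32 by default)."""
    n = 1
    for d in shape:
        n *= d
    return n * dtype_bytes

# Hypothetical output shapes (C, H, W) of each operator in a sequential CNN.
layer_outputs = [
    (64, 112, 112),  # conv1
    (64, 56, 56),    # pool1
    (128, 56, 56),   # conv2
    (128, 28, 28),   # pool2
    (256, 28, 28),   # conv3
]

def naive_footprint(shapes):
    """Every intermediate tensor gets its own buffer."""
    return sum(tensor_bytes(s) for s in shapes)

def shared_two_buffer_footprint(shapes):
    """In a purely sequential CNN, each operator only needs its input and
    its output alive at once, so two alternating (ping-pong) buffers
    suffice; each must be as large as the biggest tensor it ever holds."""
    even = max(tensor_bytes(s) for s in shapes[0::2])
    odd = max(tensor_bytes(s) for s in shapes[1::2])
    return even + odd

naive = naive_footprint(layer_outputs)
shared = shared_two_buffer_footprint(layer_outputs)
print(f"naive:  {naive / 2**20:.1f} MiB")
print(f"shared: {shared / 2**20:.1f} MiB")
```

The sharing here costs nothing, but for CNNs with branches (residual connections, multi-input operators) tighter buffer reuse can force operators to serialize, which is where the throughput reduction mentioned in the abstract comes in.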


