Euromicro Conference on Digital System Design

Buffer Sizes Reduction for Memory-efficient CNN Inference on Mobile and Embedded Devices



Abstract

Nowadays, convolutional neural networks (CNNs) are at the core of many intelligent systems, including those that run on mobile and embedded devices. However, executing computationally demanding and memory-hungry CNNs on resource-limited mobile and embedded devices is quite challenging. One of the main problems when running CNNs on such devices is the limited amount of memory available. Thus, reducing the CNN memory footprint is crucial for CNN inference on mobile and embedded devices. The CNN memory footprint is determined by the amount of memory required to store the CNN parameters (weights and biases) and the intermediate data exchanged between CNN operators. The most common approaches to reducing the CNN memory footprint, such as pruning and quantization, reduce the memory required to store the CNN parameters. However, these approaches decrease the CNN accuracy. Moreover, with the increasing depth of state-of-the-art CNNs, the intermediate data exchanged between CNN operators takes even more space than the CNN parameters. Therefore, in this paper, we propose a novel approach that reduces the memory required to store the intermediate data exchanged between CNN operators. Unlike pruning and quantization, our proposed approach preserves the CNN accuracy and reduces the CNN memory footprint at the cost of decreasing the CNN throughput. Thus, our approach is orthogonal to pruning and quantization, and can be combined with these approaches for further CNN memory footprint reduction.
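To make the footprint trade-off concrete, here is a minimal sketch (not the paper's actual algorithm, and with made-up layer shapes) of how intermediate-data memory can be estimated for a purely sequential CNN, and how sharing buffers between operators whose tensors are never alive at the same time shrinks that footprint:

```python
# Illustrative sketch only: estimates the memory needed for intermediate
# data in a hypothetical sequential CNN, comparing one-buffer-per-tensor
# allocation against ping-pong buffer sharing.

def tensor_bytes(shape, dtype_bytes=4):
    """Bytes needed to store one intermediate tensor (float32 by default)."""
    n = 1
    for d in shape:
        n *= d
    return n * dtype_bytes

# Hypothetical output shapes (C, H, W) of each operator in a sequential CNN.
layer_outputs = [
    (64, 112, 112),  # conv1
    (64, 56, 56),    # pool1
    (128, 56, 56),   # conv2
    (128, 28, 28),   # pool2
    (256, 28, 28),   # conv3
]

def naive_footprint(shapes):
    """Every intermediate tensor gets its own buffer."""
    return sum(tensor_bytes(s) for s in shapes)

def shared_two_buffer_footprint(shapes):
    """In a purely sequential CNN, each operator only needs its input and
    its output alive at once, so two alternating (ping-pong) buffers
    suffice; each must be as large as the biggest tensor it ever holds."""
    even = max(tensor_bytes(s) for s in shapes[0::2])
    odd = max(tensor_bytes(s) for s in shapes[1::2])
    return even + odd

naive = naive_footprint(layer_outputs)
shared = shared_two_buffer_footprint(layer_outputs)
print(f"naive:  {naive / 2**20:.1f} MiB")
print(f"shared: {shared / 2**20:.1f} MiB")
```

The sharing here costs nothing, but for CNNs with branches (residual connections, multi-input operators) tighter buffer reuse can force operators to serialize, which is where the throughput reduction mentioned in the abstract comes in.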


