Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems

ROMANet: Fine-Grained Reuse-Driven Off-Chip Memory Access Management and Data Organization for Deep Neural Network Accelerators

Abstract

High energy efficiency is crucial for embedded implementations of deep learning. Several studies have shown that DRAM-based off-chip memory accesses are among the most energy-consuming operations in deep neural network (DNN) accelerators and thereby keep designs from reaching their full efficiency potential. DRAM access energy depends on the number of accesses required and the energy consumed per access. Therefore, finding a solution that minimizes DRAM access energy is an important optimization problem. To this end, we propose the ROMANet methodology, which reduces the number of memory accesses by using design space exploration to find an appropriate data partitioning and scheduling for each layer of a network, based on the available on-chip memory and the data reuse factors. Moreover, ROMANet decreases the number of DRAM row buffer conflicts and misses by exploiting the DRAM multibank burst feature, which improves the energy per access. Besides providing energy benefits, our proposed DRAM data mapping also increases the effective DRAM throughput, which is useful for latency-constrained scenarios. Our experimental results show that ROMANet reduces DRAM access energy by 12% for AlexNet, 36% for VGG-16, 46% for MobileNet, and 45% for SqueezeNet, while improving DRAM throughput by 10% on average across different networks, compared to the state-of-the-art bus-width aware (BWA) technique.
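
To make the two levers in the abstract concrete (fewer DRAM accesses through per-layer partitioning and scheduling, and lower energy per access through fewer row buffer misses), the following is a minimal Python sketch. It is not the ROMANet algorithm: the cost model, the two loop orders, and the names explore and dram_energy are assumptions made for illustration only. The sketch assumes a stride-1 convolution without padding, counts accesses in data elements, and treats every non-hit access as a single "miss" cost supplied by the caller.

```python
from math import ceil


def explore(P, Q, C, K, R, S, buffer_elems):
    """Toy design space exploration for one convolution layer.

    P, Q: output rows/cols; C, K: input/output channels; R, S: filter size.
    buffer_elems: on-chip buffer capacity in elements (hypothetical model).
    Returns (accesses, Th, Tk, order) with the fewest estimated DRAM
    element accesses, or None if no tile fits in the buffer.
    """
    w_in = Q + S - 1                      # input width for stride 1, no padding
    weights_all = R * S * C * K           # every weight, read once
    ofmap_all = P * Q * K                 # every output, written once
    best = None
    for Th in range(1, P + 1):            # tile height in output rows
        for Tk in range(1, K + 1):        # tile depth in output channels
            ifmap_tile = (Th + R - 1) * w_in * C
            weight_tile = R * S * C * Tk
            ofmap_tile = Th * Q * Tk
            if ifmap_tile + weight_tile + ofmap_tile > buffer_elems:
                continue                  # tile does not fit on-chip
            n_row = ceil(P / Th)
            n_k = ceil(K / Tk)
            # One full sweep over the input, including re-read halo rows.
            ifmap_pass = (P + n_row * (R - 1)) * w_in * C
            candidates = {
                # Weights reused on-chip; input re-read per channel tile.
                "weights_outer": weights_all + n_k * ifmap_pass + ofmap_all,
                # Input reused on-chip; weights re-read per row tile.
                "ifmap_outer": ifmap_pass + n_row * weights_all + ofmap_all,
            }
            for order, accesses in candidates.items():
                if best is None or accesses < best[0]:
                    best = (accesses, Th, Tk, order)
    return best


def dram_energy(accesses, hit_rate, e_hit, e_miss):
    """Energy = accesses x average energy per access (caller-supplied units)."""
    return accesses * (hit_rate * e_hit + (1.0 - hit_rate) * e_miss)
```

With a buffer large enough to hold the whole layer, explore returns the lower bound in which each element is transferred exactly once; shrinking buffer_elems forces smaller tiles and more re-reads. A higher hit_rate in dram_energy stands in for a data mapping that causes fewer row buffer conflicts and misses.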