Venue: International Conference on VLSI Design; International Conference on Embedded Systems

Towards Near Data Processing of Convolutional Neural Networks



Abstract

The gap between the processing speed of the CPU and the access speed of memory is becoming a bottleneck for many data-intensive applications. This gap can be reduced if the computation is moved closer to the data. Recent advances in memory technology have made 3D stacked memory with an integrated logic layer feasible, enabling near data processing (NDP). Convolutional Neural Networks (CNNs) are used in a wide range of applications such as image processing, video analysis, and natural language processing. They are data intensive and their computations are highly parallel; performing them near memory can therefore achieve higher throughput. This paper proposes a CNN Logic Unit (CLU), a hardware implementation in the logic layer of 3D stacked memory such as the Hybrid Memory Cube (HMC), realizing the concept of NDP. Full-system simulation results show that the method is very promising: CNN operations performed near memory achieve a severalfold improvement over conventional CPU-based systems with DRAM, with a 76x gain in performance and a 55x reduction in energy consumption.
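As an illustration of why near-memory execution helps, a minimal sketch (not taken from the paper) of the core CNN operation a logic-layer unit like the proposed CLU would offload: a 2D convolution, whose output elements are independent dot products over data streamed from memory and can therefore be computed in parallel close to where the arrays reside.

```python
# Illustrative sketch only: a valid-mode 2D convolution (no padding, stride 1),
# the data-intensive, highly parallel kernel at the heart of CNN inference.

def conv2d(image, kernel):
    """Return the valid-mode 2D convolution of image with kernel."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = ih - kh + 1, iw - kw + 1
    out = [[0] * ow for _ in range(oh)]
    for r in range(oh):
        for c in range(ow):
            # Each output element is an independent dot product, so all
            # oh * ow of them could be evaluated in parallel near memory.
            out[r][c] = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
    return out

# 3x3 input, 2x2 kernel -> 2x2 output
img = [[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]]
k = [[1, 0],
     [0, 1]]
print(conv2d(img, k))  # -> [[6, 8], [12, 14]]
```

In a conventional system every `image` and `kernel` element crosses the memory bus to the CPU; placing this loop nest in the logic layer of the stacked memory keeps that traffic on-package, which is the source of the throughput and energy gains the paper reports.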
