Published in: IEEE International Workshop on Signal Processing Systems

Bandwidth Efficient Architectures for Convolutional Neural Network



Abstract

In recent years, Convolutional Neural Networks (CNNs) have evolved rapidly, and demand for real-time CNN implementations in embedded systems is growing. High-performance, real-time CNN-based implementations therefore need to be realized on local processors. Conventional approaches to designing CNN accelerators focus on reducing the computational workload of CNNs. However, the limited external memory bandwidth has become the main bottleneck of CNN acceleration in embedded systems: in deep and large CNN models, the feature map pixels and weights are numerous, must be stored in external memory, and need to be exchanged frequently between off-chip and on-chip memories. Hence performance is constrained by the limited external memory bandwidth. In this paper, bandwidth-efficient architectures for CNN implementation are proposed. The intermediate pixel data are stored on chip, and kernel weights are transferred in an efficient way. Compared to mainstream CNN implementation methods, the proposed architectures can efficiently utilize external memory bandwidth while preserving the original throughput.
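
To make the bandwidth argument concrete, the following is a rough back-of-envelope sketch, not the architecture or cost model from the paper. It estimates off-chip traffic for a stack of convolutional layers in two cases: every feature map goes through external memory, or intermediate feature maps stay on chip. The names `ConvLayer` and `offchip_traffic_bytes`, the 2-byte element size, and the assumption that each layer's weights are read exactly once (no tiling re-reads, no biases) are all illustrative choices, not taken from the source.

```python
from dataclasses import dataclass


@dataclass
class ConvLayer:
    """Hypothetical description of one convolutional layer."""
    in_ch: int
    out_ch: int
    in_h: int
    in_w: int
    kernel: int
    stride: int = 1
    pad: int = 0

    @property
    def out_h(self) -> int:
        return (self.in_h + 2 * self.pad - self.kernel) // self.stride + 1

    @property
    def out_w(self) -> int:
        return (self.in_w + 2 * self.pad - self.kernel) // self.stride + 1

    def input_elems(self) -> int:
        return self.in_ch * self.in_h * self.in_w

    def output_elems(self) -> int:
        return self.out_ch * self.out_h * self.out_w

    def weight_elems(self) -> int:
        return self.out_ch * self.in_ch * self.kernel * self.kernel


def offchip_traffic_bytes(layers, bytes_per_elem=2, keep_intermediate_on_chip=False):
    """Estimate external-memory traffic under simplifying assumptions.

    Assumption: weights are fetched once per layer. When
    keep_intermediate_on_chip is True, only the first layer's input and
    the last layer's output cross the chip boundary.
    """
    total_elems = 0
    last = len(layers) - 1
    for i, layer in enumerate(layers):
        total_elems += layer.weight_elems()
        if not keep_intermediate_on_chip or i == 0:
            total_elems += layer.input_elems()
        if not keep_intermediate_on_chip or i == last:
            total_elems += layer.output_elems()
    return total_elems * bytes_per_elem


# Illustrative two-layer network (sizes are made up for the example).
layers = [
    ConvLayer(in_ch=3, out_ch=8, in_h=32, in_w=32, kernel=3, pad=1),
    ConvLayer(in_ch=8, out_ch=16, in_h=32, in_w=32, kernel=3, pad=1),
]
baseline = offchip_traffic_bytes(layers)
on_chip = offchip_traffic_bytes(layers, keep_intermediate_on_chip=True)
print(baseline, on_chip, baseline - on_chip)
```

In this toy setting the saving equals one write plus one read of the intermediate feature map. A real design would also have to account for on-chip buffer capacity and for how weights are scheduled, which this sketch ignores.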
