IEEE Transactions on Computers

Enabling Highly Efficient Capsule Networks Processing Through Software-Hardware Co-Design



Abstract

As demand for image processing grows, image features become increasingly complex. Although Convolutional Neural Networks (CNNs) have been widely adopted for image processing tasks, they are easily misled because of their heavy use of pooling operations. A novel neural network structure called the Capsule Network (CapsNet) has been proposed to address this weakness of CNNs and to enhance learning ability for image segmentation and object detection. Since CapsNet involves a high volume of matrix operations, it is generally accelerated on modern GPU platforms with highly optimized deep-learning libraries. However, the routing procedure of CapsNet introduces distinctive program and execution characteristics, including massive unshareable intermediate variables and intensive synchronization, which make CapsNet execution inefficient on modern GPUs. To address these challenges, we propose a set of software-hardware co-designed optimizations, SH-CapsNet, which comprises software-level optimizations named S-CapsNet and a hybrid computing architecture named PIM-CapsNet. At the software level, S-CapsNet reduces computation and memory accesses by exploiting the computational redundancy and data similarity of the routing procedure. At the hardware level, PIM-CapsNet leverages the processing-in-memory capability of today's 3D stacked memory to accelerate the routing procedure off-chip, in memory, while pipelining with the GPU's on-chip computing capability, which accelerates the CNN-type layers in CapsNet. Evaluation results demonstrate that either our software or our hardware optimizations alone can significantly improve CapsNet execution efficiency. Together, our co-design achieves substantial improvements in both performance (3.41×) and energy savings (68.72 percent) for CapsNet inference, with negligible accuracy loss.
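The abstract does not spell out the routing procedure itself. As a rough illustration of where the per-pair intermediate variables and the synchronization points come from, here is a minimal pure-Python sketch of routing-by-agreement as it is commonly described for CapsNet. It is an assumption about the baseline being accelerated, not the S-CapsNet or PIM-CapsNet method; the names `squash`, `softmax`, `dynamic_routing`, `u_hat`, and `num_iters` are hypothetical and chosen only for this example.

```python
import math

def squash(s):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0] * len(s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]

def softmax(row):
    """Numerically stable softmax over one list of logits."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, num_iters=3):
    """Routing-by-agreement sketch (assumed baseline, not the paper's optimization).

    u_hat[i][j] is the prediction vector that low-level capsule i makes
    for high-level capsule j. Returns the high-level capsule outputs v
    and the final routing logits b.
    """
    n_low, n_high, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    # One routing logit per (i, j) pair: intermediate state that is private
    # to each pair and cannot be shared across pairs.
    b = [[0.0] * n_high for _ in range(n_low)]
    v = []
    for _ in range(num_iters):
        # Coupling coefficients: softmax over the high-level capsules.
        c = [softmax(row) for row in b]
        # Weighted sum over all low-level capsules, then squash.
        v = []
        for j in range(n_high):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_low))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Agreement update: every v[j] must be complete before any b[i][j]
        # is updated, which acts as a synchronization point per iteration.
        for i in range(n_low):
            for j in range(n_high):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, b
```

In this sketch each iteration reads every prediction vector twice and must finish one phase before the next begins, which is consistent with the abstract's description of heavy intermediate state and frequent synchronization.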
