International Symposium on Advanced Parallel Processing Technologies

SPART: Optimizing CNNs by Utilizing Both Sparsity of Weights and Feature Maps



Abstract

Intense convolution computation and large memory requirements constrain the wider deployment and application of CNNs. Although both the weights and the feature maps in CNNs can be sparse, directly mapping sparse convolution to spGEMM from the HPC domain fails to improve actual performance. Moreover, existing sparse formats such as CSR are not suitable for encoding sparse feature maps, because convolution operates across rows. In this work, we propose a new format and a novel sparse convolution algorithm to optimize sparse CNNs on GPUs. First, we design the Compressed Feature Map (CFM) format to store sparse feature maps. Second, we propose an efficient sparse convolution algorithm, called SPART, that exploits both sparse weights and sparse feature maps. Finally, we optimize this algorithm on GPUs. Our experiments show that SPART performs well: compared with dense convolution, its speedup is up to 2.62× (1.77× on average) on V100 and up to 1.84× (1.24× on average) on Titan X.
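The abstract does not detail the CFM encoding or the SPART GPU kernel, but the core idea it names, namely skipping every multiplication in which either the weight or the activation is zero, can be sketched in plain Python. Everything below (the `sparse_conv2d` name, the nonzero-coordinate list used as a stand-in for CFM, the scatter-style traversal) is an illustrative assumption, not the paper's implementation.

```python
def sparse_conv2d(fmap, kernel):
    """Direct 2D convolution (valid padding, stride 1) that only visits
    nonzero weights and nonzero activations.

    `fmap` and `kernel` are dense lists-of-lists here; a real sparse
    format (such as the paper's CFM) would store the nonzero
    coordinates directly instead of scanning for them.
    """
    H, W = len(fmap), len(fmap[0])
    K = len(kernel)
    OH, OW = H - K + 1, W - K + 1
    out = [[0.0] * OW for _ in range(OH)]
    # Precompute the nonzero weights once: (row, col, value) triples.
    nz_w = [(ki, kj, kernel[ki][kj])
            for ki in range(K) for kj in range(K)
            if kernel[ki][kj] != 0]
    # Scatter each nonzero activation into every output it touches;
    # zero activations and zero weights contribute no work at all.
    for i in range(H):
        for j in range(W):
            a = fmap[i][j]
            if a == 0:
                continue
            for ki, kj, w in nz_w:
                oi, oj = i - ki, j - kj
                if 0 <= oi < OH and 0 <= oj < OW:
                    out[oi][oj] += a * w
    return out
```

Note the scatter (output-accumulating) order: each nonzero activation is read once and updates all outputs it overlaps, so the inner-loop trip count is proportional to the product of the two nonzero counts rather than to the dense H×W×K×K work.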

