International Symposium on Advanced Parallel Processing Technologies

SPART: Optimizing CNNs by Utilizing Both Sparsity of Weights and Feature Maps



Abstract

The intense convolution computation and large memory requirements of CNNs constrain their wider deployment and application. Although both the weights and the feature maps in CNNs can be sparse, directly mapping sparse convolution to spGEMM from the HPC domain fails to improve actual performance. Moreover, existing sparse formats such as CSR are not suitable for encoding sparse feature maps, because convolution operates across rows. In this work, we propose a new format and a novel sparse convolution algorithm to optimize sparse CNNs on GPUs. First, we design the Compressed Feature Map (CFM) format to store sparse feature maps. Second, we propose an efficient sparse convolution algorithm called SPART that exploits both sparse weights and sparse feature maps. Finally, we optimize this algorithm on GPUs. Our experiments show that the SPART algorithm performs well: compared with dense convolution, its speedup reaches up to 2.62× (1.77× on average) on V100 and up to 1.84× (1.24× on average) on Titan X.
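The abstract does not spell out the CFM layout or the SPART kernel itself, but the core idea it describes, skipping work for zero weights *and* zero activations, can be illustrated with a minimal NumPy sketch. The function names and the scatter-style loop below are illustrative only, not the paper's actual algorithm: the sparse variant enumerates nonzero feature-map entries and nonzero weights, and scatters each product to the output it contributes to, so arithmetic scales with the product of the two nonzero counts rather than with the dense tensor sizes.

```python
import numpy as np

def dense_conv2d(fmap, kernel):
    """Reference dense 2D convolution (valid padding, stride 1)."""
    H, W = fmap.shape
    K, _ = kernel.shape
    out = np.zeros((H - K + 1, W - K + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(fmap[i:i + K, j:j + K] * kernel)
    return out

def sparse_conv2d(fmap, kernel):
    """Convolution touching only nonzero activations and nonzero weights.

    Each nonzero input pixel (fi, fj) times nonzero weight (ki, kj)
    contributes to output position (fi - ki, fj - kj), so all-zero
    products are never computed (scatter formulation).
    """
    H, W = fmap.shape
    K, _ = kernel.shape
    out = np.zeros((H - K + 1, W - K + 1))
    nz_weights = list(zip(*np.nonzero(kernel)))   # nonzero weight coords
    nz_inputs = list(zip(*np.nonzero(fmap)))      # nonzero activation coords
    for fi, fj in nz_inputs:
        v = fmap[fi, fj]
        for ki, kj in nz_weights:
            oi, oj = fi - ki, fj - kj
            if 0 <= oi < out.shape[0] and 0 <= oj < out.shape[1]:
                out[oi, oj] += v * kernel[ki, kj]
    return out
```

On a GPU, the interesting part is organizing these nonzeros so that threads get coalesced, balanced work, which is what a purpose-built format like CFM targets; the CSR objection in the abstract is that row-wise storage fights the across-rows access pattern of convolution windows.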
