Neurocomputing

Sparse low rank factorization for deep neural network compression

Abstract

Storing and processing millions of parameters in deep neural networks is highly challenging when deploying models in real-time applications on resource-constrained devices. The popular low-rank approximation approach, singular value decomposition (SVD), is generally applied to the weights of fully connected layers, where compact storage is achieved by keeping only the most prominent components of the decomposed matrices. Years of research on pruning-based neural network model compression have revealed that the relative importance or contribution of neurons within a layer varies considerably from one neuron to another. Recently, synapse pruning has also demonstrated that sparse matrices in the network architecture yield lower storage requirements and faster computation at inference time. We extend these arguments by proposing that the low-rank decomposition of weight matrices should also account for the significance of both the input and the output neurons of a layer. Combining the idea of sparsity with the unequal contributions of neurons towards the target, we propose the sparse low rank (SLR) method, which sparsifies the SVD matrices to achieve a better compression rate by keeping a lower rank for unimportant neurons. We demonstrate the effectiveness of our method in compressing well-known convolutional neural network based image recognition frameworks trained on popular datasets. Experimental results show that the proposed SLR approach outperforms vanilla truncated SVD and a pruning baseline, achieving better compression rates with minimal or no loss in accuracy. Code for the proposed approach is available at https://github.com/sridarah/slr. (C) 2020 Elsevier B.V. All rights reserved.
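
The following is a minimal sketch of the idea the abstract describes, not the authors' implementation (their code is at the GitHub link above). It assumes NumPy and externally supplied per-neuron importance masks (the function name `sparse_low_rank`, the `rank_high`/`rank_low` parameters, and the random importance masks are illustrative assumptions): a fully connected weight matrix is factorized with truncated SVD, and the factor matrices are then sparsified so that unimportant input/output neurons retain fewer components than important ones.

```python
# Hedged sketch of sparse low-rank factorization of a dense layer weight matrix.
# Assumptions: importance masks are given; the paper derives them from neuron
# contributions, here they are arbitrary booleans for illustration.
import numpy as np

def sparse_low_rank(W, input_importance, output_importance,
                    rank_high=32, rank_low=8):
    """Factorize W (out_dim x in_dim) into sparse low-rank factors U_s, Vt_s."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank_high] * s[:rank_high]   # absorb singular values into U
    Vt_r = Vt[:rank_high, :]

    # Unimportant output neurons (rows of U) keep only the first rank_low components.
    U_s = U_r.copy()
    U_s[~output_importance, rank_low:] = 0.0
    # Unimportant input neurons (columns of V^T) likewise keep only rank_low components.
    Vt_s = Vt_r.copy()
    Vt_s[rank_low:, ~input_importance] = 0.0

    return U_s, Vt_s   # stored as sparse factors; W is approximated by U_s @ Vt_s

# Usage example with random data and arbitrary importance masks.
W = np.random.randn(512, 1024).astype(np.float32)
out_imp = np.random.rand(512) > 0.5
in_imp = np.random.rand(1024) > 0.5
U_s, Vt_s = sparse_low_rank(W, in_imp, out_imp)
err = np.linalg.norm(W - U_s @ Vt_s) / np.linalg.norm(W)
print(f"relative reconstruction error: {err:.3f}")
```

Compared with plain truncated SVD at the same nominal rank, the zeroed entries make the stored factors sparser, which is the source of the additional compression the abstract refers to.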
