Chinese Control Conference

Deep Neural Network Compression Method Based on Product Quantization



Abstract

In this paper, a method combining product quantization and pruning is proposed to compress deep neural networks with large model sizes and heavy computational cost. First, pruning is used to remove redundant parameters from the deep neural network, and the pruned network is then retrained for fine-tuning. Next, product quantization is used to quantize the network parameters to 8 bits, which reduces the storage overhead so that the deep neural network can be deployed on embedded devices. For classification tasks on the MNIST and CIFAR-10 datasets, network models such as LeNet5, AlexNet, and ResNet are compressed by factors of 23 to 38 with minimal loss of accuracy.
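The pipeline described above has two stages: pruning followed by fine-tuning, then product quantization of the surviving parameters to 8 bits. The abstract gives no implementation details, so the following is a minimal NumPy sketch under assumed choices: unstructured magnitude pruning, and a product quantizer that splits each layer's weight columns into sub-vectors and learns a 256-entry (8-bit) k-means codebook per subspace. All names and hyperparameters here (`prune_by_magnitude`, `product_quantize`, `sub_dim`, the 70% sparsity) are illustrative, not taken from the paper.

```python
import numpy as np

def prune_by_magnitude(w, sparsity=0.7):
    """Zero out the smallest-magnitude weights (unstructured pruning)."""
    threshold = np.quantile(np.abs(w), sparsity)
    mask = np.abs(w) >= threshold
    return w * mask, mask

def kmeans(x, k=256, iters=20, seed=0):
    """Plain k-means on a set of sub-vectors; returns (centroids, codes)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # assign each sub-vector to its nearest centroid
        d = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        codes = d.argmin(1)
        for j in range(k):
            pts = x[codes == j]
            if len(pts):  # skip empty clusters
                centroids[j] = pts.mean(0)
    return centroids, codes.astype(np.uint8)  # k <= 256 fits in 8 bits

def product_quantize(w, sub_dim=4, k=256):
    """Split columns into sub_dim-wide sub-vectors; one 8-bit codebook per subspace."""
    rows, cols = w.shape
    assert cols % sub_dim == 0
    codebooks, codes = [], []
    for s in range(cols // sub_dim):
        block = w[:, s * sub_dim:(s + 1) * sub_dim]
        c, idx = kmeans(block, k)
        codebooks.append(c)
        codes.append(idx)
    return codebooks, np.stack(codes, axis=1)  # codes: rows x num_subspaces, uint8

def dequantize(codebooks, codes):
    """Reconstruct an approximate weight matrix from codes and codebooks."""
    return np.concatenate(
        [codebooks[s][codes[:, s]] for s in range(codes.shape[1])], axis=1)

# toy demo on one random "layer"
w = np.random.randn(512, 64).astype(np.float32)
w_pruned, mask = prune_by_magnitude(w, sparsity=0.7)
# (fine-tuning of the pruned network would happen here, with the mask held fixed)
books, codes = product_quantize(w_pruned, sub_dim=4, k=256)
w_hat = dequantize(books, codes) * mask  # re-apply the pruning mask
print("reconstruction MSE:", float(((w_pruned - w_hat) ** 2).mean()))
```

With 256 centroids, each sub-vector of four float32 values (128 bits) is replaced by a single 8-bit code, roughly a 16x reduction before codebook overhead; combined with pruning, compression in the 23-to-38x range reported above is plausible, though the paper's exact scheme may differ from this sketch.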

