
Bit-Quantized-Net: An Effective Method for Compressing Deep Neural Networks



Abstract

Deep neural networks have achieved state-of-the-art performance in a wide range of scenarios, such as natural language processing, object detection, image classification, and speech recognition. While showing impressive results across these machine learning tasks, neural network models still suffer from high computational cost and memory consumption for parameter training and storage in mobile service scenarios. How to simplify models and accelerate neural networks is therefore a crucial research topic. To address this issue, in this paper we propose "Bit-Quantized-Net" (BQ-Net), which compresses deep neural networks at both the training and inference phases, and reduces the model size by compressing bit-quantized weights. Specifically, training or testing a plain neural network model runs tens of millions of y = wx + b computations. In BQ-Net, however, the model approximates the operation y = wx + b by y = sign(w)(x << |w|) + b during forward propagation, where << denotes a bit shift by the quantized weight magnitude. That is, BQ-Net trains the network with bit-quantized weights during forward propagation, while retaining the full-precision weights for gradient accumulation during backward propagation. Finally, we apply Huffman coding to encode the bit-shift weights, which compresses the model size further. Extensive experiments on three real data sets (MNIST, CIFAR-10, SVHN) show that BQ-Net can achieve 10-14x model compression.
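As a rough illustration of the scheme described in the abstract, the sketch below (PyTorch-style Python, not the authors' code; the class and function names and the power-of-two rounding rule are assumptions inferred from the sign/bit-shift formulation) quantizes each weight to a signed power of two in the forward pass and uses a straight-through estimator so gradients accumulate on the full-precision weights:

import torch

class BitQuantize(torch.autograd.Function):
    # Forward: replace each weight by sign(w) * 2^round(log2|w|), i.e. a
    # signed power of two, so multiplying by w reduces to a sign flip
    # plus a bit shift of the activation.
    @staticmethod
    def forward(ctx, w):
        shift = torch.round(torch.log2(w.abs().clamp(min=1e-8)))
        return torch.sign(w) * torch.pow(2.0, shift)

    # Backward: straight-through estimator -- the gradient passes
    # unchanged to the retained full-precision weights.
    @staticmethod
    def backward(ctx, grad_output):
        return grad_output

def quantized_linear(x, w, b):
    # y = wx + b with bit-quantized weights; multiplying by 2^shift is
    # what hardware would realize as a bit shift of x.
    return x @ BitQuantize.apply(w).t() + b

Under this quantization, a weight of 0.23, for example, becomes 0.25 = 2^-2, so the multiply is realized as a two-bit right shift of x; only the sign and the integer shift amount would need to be stored.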
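The final Huffman-coding step can be sketched with Python's standard heapq module. This is a generic Huffman construction over the stored shift values; the abstract does not specify the paper's exact encoding, so the helper below is hypothetical:

import heapq
from collections import Counter

def huffman_codes(shifts):
    # Build a Huffman code table over quantized shift values. Heap entries
    # are (frequency, tiebreak id, tree); a tree is a symbol or a
    # (left, right) pair, and the id keeps tuples from being compared.
    freq = Counter(shifts)
    heap = [(f, i, s) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate: one distinct symbol
        return {heap[0][2]: "0"}
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (t1, t2)))
        next_id += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):         # internal node
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                               # leaf: a shift value
            codes[tree] = prefix
    walk(heap[0][2], "")
    return codes

# Frequent shift values get short codewords:
print(huffman_codes([-2, -2, -2, -1, -1, 0, 3]))
# e.g. {-2: '0', -1: '10', 0: '110', 3: '111'}

Because the shift exponents of trained weights cluster around a few small values, Huffman coding assigns those values short codewords, which is what yields the extra compression on top of the bit quantization.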

Bibliographic Details

  • Source
    《Mobile networks & applications》 | 2021, Issue 1 | pp. 104-113 | 10 pages
  • Author Affiliations

    Harbin Inst Technol, Dept Comp Sci & Technol, Harbin, Peoples R China;

    South China Univ Technol, Sch Software Engn, Guangzhou, Peoples R China;

    Harbin Inst Technol, Dept Comp Sci & Technol, Harbin, Peoples R China;

    South China Univ Technol, Sch Software Engn, Guangzhou, Peoples R China;

    Harbin Inst Technol, Dept Comp Sci & Technol, Harbin, Peoples R China;

  • Indexing Information
  • Original Format: PDF
  • Language: eng
  • CLC Classification
  • Keywords
