
Bit-Quantized-Net: An Effective Method for Compressing Deep Neural Networks



Abstract

Deep neural networks have achieved state-of-the-art performance in a wide range of scenarios, such as natural language processing, object detection, image classification, and speech recognition. While showing impressive results across these machine learning tasks, neural network models still suffer from high computational cost and memory consumption for parameter training and storage in mobile service scenarios. How to simplify models and accelerate neural networks is therefore a crucial research topic. To address this issue, in this paper we propose "Bit-Quantized-Net" (BQ-Net), which compresses deep neural networks at both the training and inference phases, and reduces the model size by compressing bit-quantized weights. Specifically, training or testing a plain neural network model runs tens of millions of y = wx + b computations. In BQ-Net, however, the model approximates the operation y = wx + b by y = sign(w)(x << |w|) + b during forward propagation, where << denotes a bit shift by the quantized weight magnitude. That is, BQ-Net trains the network with bit-quantized weights during forward propagation, while retaining the full-precision weights for gradient accumulation during backward propagation. Finally, we apply Huffman coding to encode the bit-shift weights, which compresses the model size further. Extensive experiments on three real data sets (MNIST, CIFAR-10, SVHN) show that BQ-Net can achieve 10-14x model compression.
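As a rough illustration of the scheme described in the abstract, the sketch below (PyTorch-style Python, not the authors' code; the class and function names and the power-of-two rounding rule are assumptions inferred from the sign/bit-shift formulation) quantizes each weight to a signed power of two in the forward pass and uses a straight-through estimator so gradients accumulate on the full-precision weights:

import torch

class BitQuantize(torch.autograd.Function):
    # Forward: replace each weight by sign(w) * 2^round(log2|w|), i.e. a
    # signed power of two, so multiplying by w reduces to a sign flip
    # plus a bit shift of the activation.
    @staticmethod
    def forward(ctx, w):
        shift = torch.round(torch.log2(w.abs().clamp(min=1e-8)))
        return torch.sign(w) * torch.pow(2.0, shift)

    # Backward: straight-through estimator -- the gradient passes
    # unchanged to the retained full-precision weights.
    @staticmethod
    def backward(ctx, grad_output):
        return grad_output

def quantized_linear(x, w, b):
    # y = wx + b with bit-quantized weights; multiplying by 2^shift is
    # what hardware would realize as a bit shift of x.
    return x @ BitQuantize.apply(w).t() + b

Under this quantization, a weight of 0.23, for example, becomes 0.25 = 2^-2, so the multiply is realized as a two-bit right shift of x; only the sign and the integer shift amount would need to be stored.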
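The final Huffman-coding step can be sketched with Python's standard heapq module. This is a generic Huffman construction over the stored shift values; the abstract does not specify the paper's exact encoding, so the helper below is hypothetical:

import heapq
from collections import Counter

def huffman_codes(shifts):
    # Build a Huffman code table over quantized shift values. Heap entries
    # are (frequency, tiebreak id, tree); a tree is a symbol or a
    # (left, right) pair, and the id keeps tuples from being compared.
    freq = Counter(shifts)
    heap = [(f, i, s) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:                      # degenerate: one distinct symbol
        return {heap[0][2]: "0"}
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next_id, (t1, t2)))
        next_id += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):         # internal node
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:                               # leaf: a shift value
            codes[tree] = prefix
    walk(heap[0][2], "")
    return codes

# Frequent shift values get short codewords:
print(huffman_codes([-2, -2, -2, -1, -1, 0, 3]))
# e.g. {-2: '0', -1: '10', 0: '110', 3: '111'}

Because the shift exponents of trained weights cluster around a few small values, Huffman coding assigns those values short codewords, which is what yields the extra compression on top of the bit quantization.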

Bibliographic Details

  • Source
    《Mobile networks & applications》 | 2021, Issue 1 | pp. 104-113 | 10 pages
  • Author Affiliations

    Harbin Inst Technol, Dept Comp Sci & Technol, Harbin, Peoples R China;

    South China Univ Technol, Sch Software Engn, Guangzhou, Peoples R China;

    Harbin Inst Technol, Dept Comp Sci & Technol, Harbin, Peoples R China;

    South China Univ Technol, Sch Software Engn, Guangzhou, Peoples R China;

    Harbin Inst Technol, Dept Comp Sci & Technol, Harbin, Peoples R China;

  • Indexing Information
  • Original Format: PDF
  • Language: eng
  • CLC Classification
  • Keywords
