IEEE/CVF Conference on Computer Vision and Pattern Recognition

Towards Effective Low-bitwidth Convolutional Neural Networks

Abstract

This paper tackles the problem of training a deep convolutional neural network with both low-precision weights and low-bitwidth activations. Optimizing a low-precision network is very challenging since the training process can easily get trapped in a poor local minimum, which results in substantial accuracy loss. To mitigate this problem, we propose three simple yet effective approaches to improve network training. First, we propose a two-stage optimization strategy to progressively find good local minima. Specifically, we first optimize a network with quantized weights and only then with quantized activations. This is in contrast to traditional methods, which optimize both simultaneously. Second, in a similar spirit to the first method, we propose a second progressive optimization approach that gradually decreases the bitwidth from high precision to low precision during training. Third, we adopt a novel learning scheme to jointly train a full-precision model alongside the low-precision one. By doing so, the full-precision model provides hints to guide the low-precision model training. Extensive experiments on CIFAR-100 and ImageNet show the effectiveness of the proposed methods. Notably, using our methods to train a 4-bit precision network leads to no performance decrease compared with its full-precision counterpart on standard network architectures (AlexNet and ResNet-50).
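
The abstract gives no implementation details, so the following is a minimal PyTorch-style sketch of the first idea (two-stage optimization: quantize weights first, then activations), assuming a DoReFa-style uniform quantizer trained through a straight-through estimator. All names here (`QuantizeSTE`, `QConv2d`, `quantize_acts`, the bitwidth `k`) are hypothetical illustrations, not the authors' code.

```python
# A minimal sketch (not the authors' released code) of two-stage low-bitwidth
# training: stage 1 quantizes weights only, stage 2 also quantizes activations.
import torch
import torch.nn as nn
import torch.nn.functional as F


class QuantizeSTE(torch.autograd.Function):
    """Uniform k-bit quantizer for inputs in [0, 1]; the backward pass is a
    straight-through estimator (identity gradient), a standard trick for
    training through the non-differentiable rounding step."""

    @staticmethod
    def forward(ctx, x, k):
        n = 2 ** k - 1                      # number of quantization intervals
        return torch.round(x * n) / n

    @staticmethod
    def backward(ctx, grad_out):
        return grad_out, None               # identity w.r.t. x, no grad for k


class QConv2d(nn.Conv2d):
    """Convolution with k-bit weights and, optionally, k-bit activations."""

    def __init__(self, *args, k=4, quantize_acts=False, **kwargs):
        super().__init__(*args, **kwargs)
        self.k = k
        self.quantize_acts = quantize_acts  # stage 1: False, stage 2: True

    def forward(self, x):
        # DoReFa-style weight quantization: squash weights into [0, 1],
        # quantize, then rescale back to [-1, 1].
        w = torch.tanh(self.weight)
        w = w / (2 * w.abs().max()) + 0.5
        w_q = 2 * QuantizeSTE.apply(w, self.k) - 1
        if self.quantize_acts:
            # Assumes activations are clipped to [0, 1] (e.g. a bounded ReLU).
            x = QuantizeSTE.apply(torch.clamp(x, 0.0, 1.0), self.k)
        return F.conv2d(x, w_q, self.bias, self.stride,
                        self.padding, self.dilation, self.groups)


# Stage 1: train with low-precision weights but full-precision activations.
model = nn.Sequential(QConv2d(3, 8, 3, padding=1, k=4), nn.ReLU())
# ... run the usual training loop here ...

# Stage 2: switch on activation quantization and fine-tune from stage 1.
for m in model.modules():
    if isinstance(m, QConv2d):
        m.quantize_acts = True
# ... fine-tune with the same loop ...
```

Under the same assumptions, the second method would repeat stage 2 while stepping `k` down (e.g. 8 → 4 → 2), each time initializing from the previous solution; the third method would add a hint term such as `loss = task_loss + mu * F.mse_loss(lp_feat, fp_feat.detach())`, pulling the low-precision model's features toward those of a full-precision teacher.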