IEEE/CVF Conference on Computer Vision and Pattern Recognition

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference



Abstract

The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy post quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.
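The core of the scheme the abstract refers to is an affine mapping between real values and 8-bit integers, r ≈ S · (q − Z), where the scale S is a float and the zero-point Z is an integer, chosen so that real zero is exactly representable. A minimal pure-Python sketch of that mapping (function names and the range-selection details are illustrative, not the paper's reference implementation):

```python
def quantize(values, num_bits=8):
    """Affine quantization r ~ S * (q - Z) over a list of floats."""
    qmin, qmax = 0, 2 ** num_bits - 1
    r_min = min(min(values), 0.0)  # the real range must include 0.0
    r_max = max(max(values), 0.0)  # so that zero maps exactly to Z
    scale = (r_max - r_min) / (qmax - qmin)
    zero_point = round(qmin - r_min / scale)
    q = [max(qmin, min(qmax, round(r / scale) + zero_point)) for r in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate real values from quantized integers."""
    return [scale * (qi - zero_point) for qi in q]

reals = [-1.0, 0.0, 0.5, 2.0]
q, s, z = quantize(reals)
approx = dequantize(q, s, z)  # each entry within one scale step of the input
```

With this parameterization, matrix multiplications can be carried out entirely in integer arithmetic (int8 operands accumulated in int32), with the float scales folded into a single fixed-point rescaling at the end, which is what makes integer-only hardware execution possible.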

