IEEE/CVF Conference on Computer Vision and Pattern Recognition

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference



Abstract

The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy post quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.
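The core of the scheme the abstract refers to is an affine mapping between real values and 8-bit integers, r ≈ S · (q − Z), where the scale S is a float and the zero-point Z is an integer, chosen so that real zero is exactly representable. A minimal pure-Python sketch of that mapping (function names and the range-selection details are illustrative, not the paper's reference implementation):

```python
def quantize(values, num_bits=8):
    """Affine quantization r ~ S * (q - Z) over a list of floats."""
    qmin, qmax = 0, 2 ** num_bits - 1
    r_min = min(min(values), 0.0)  # the real range must include 0.0
    r_max = max(max(values), 0.0)  # so that zero maps exactly to Z
    scale = (r_max - r_min) / (qmax - qmin)
    zero_point = round(qmin - r_min / scale)
    q = [max(qmin, min(qmax, round(r / scale) + zero_point)) for r in values]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate real values from quantized integers."""
    return [scale * (qi - zero_point) for qi in q]

reals = [-1.0, 0.0, 0.5, 2.0]
q, s, z = quantize(reals)
approx = dequantize(q, s, z)  # each entry within one scale step of the input
```

With this parameterization, matrix multiplications can be carried out entirely in integer arithmetic (int8 operands accumulated in int32), with the float scales folded into a single fixed-point rescaling at the end, which is what makes integer-only hardware execution possible.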

