OPTIMIZING LOW PRECISION INFERENCE MODELS FOR DEPLOYMENT OF DEEP NEURAL NETWORKS
Abstract

Systems, apparatuses, and methods may provide technology for optimizing an inference neural network model that performs asymmetric quantization. The technology generates a quantized neural network in which the model weights are quantized as signed integer values and the input layer is configured to quantize input values as unsigned integer values. It then generates a weights accumulation table based on the quantized model weights and the kernel size of the neural network, and generates an output restoration function for the output layer of the network based on the weights accumulation table and the kernel size. The technology may also perform per-input-channel quantization and mixed-precision auto-tuning.
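The scheme in the abstract can be sketched numerically. With inputs quantized asymmetrically as unsigned integers, x ≈ s_x · (x_q − z_x), and weights quantized as signed integers, w ≈ s_w · w_q, an integer accumulation Σ w_q · x_q carries a bias term z_x · Σ w_q that the output restoration step must subtract; the per-channel sums Σ w_q are exactly what a precomputed weights accumulation table can hold. The following is a minimal NumPy sketch of that arithmetic under these assumptions; the helper names, the 8-bit ranges, and the toy single-layer shapes are illustrative, not taken from the patent:

```python
import numpy as np

def quantize_weights(w, bits=8):
    # Symmetric signed-int quantization of weights: w ~ s_w * w_q
    s = np.max(np.abs(w)) / (2 ** (bits - 1) - 1)
    wq = np.round(w / s).astype(np.int32)
    return wq, s

def quantize_input(x, bits=8):
    # Asymmetric unsigned-int quantization of inputs: x ~ s_x * (x_q - z_x)
    lo, hi = x.min(), x.max()
    s = (hi - lo) / (2 ** bits - 1)
    z = int(round(-lo / s))
    xq = np.clip(np.round(x / s) + z, 0, 2 ** bits - 1).astype(np.int32)
    return xq, s, z

# Toy fully connected layer standing in for one convolution position.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8))        # weights: 4 output channels, 8 inputs
x = rng.uniform(-1.0, 2.0, size=8) # input activations (asymmetric range)

wq, sw = quantize_weights(w)
xq, sx, zx = quantize_input(x)

# Weights accumulation table: per-output-channel sum of quantized weights,
# precomputed once from the quantized model.
w_acc = wq.sum(axis=1)

# Integer accumulation, then the output restoration function:
#   y ~ s_w * s_x * (sum(w_q * x_q) - z_x * sum(w_q))
acc = wq @ xq
y = sw * sx * (acc - zx * w_acc)

# Restored outputs track the float reference up to quantization error.
print(np.max(np.abs(y - w @ x)))
```

Precomputing `w_acc` is the point of the accumulation table: the zero-point correction becomes one multiply and subtract per output channel at inference time, rather than a per-element adjustment of the inputs.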
