DYNAMIC QUANTIZATION FOR DEEP NEURAL NETWORK INFERENCE SYSTEM AND METHOD

Abstract

A method for dynamically quantizing the feature maps of a received image. The method includes convolving the image based on a predicted maximum value, a predicted minimum value, trained kernel weights, and the image data. The input data is quantized based on the predicted minimum and maximum values. The output of the convolution is computed into an accumulator and re-quantized, and the re-quantized value is output to an external memory. The predicted minimum and maximum values are computed from previous minimum and maximum values using a weighted average or a predetermined formula. Initial minimum and maximum values are computed using known quantization methods and are used to initialize the predicted minimum and maximum values in the quantization process.
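
The range-prediction loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`quantize`, `predict_range`, `conv1d`, `run`), the 1-D convolution, the blending factor `alpha`, the 8-bit width, and the use of the first frame's plain min/max as the "known" initial calibration are all assumptions made for illustration. For brevity the sketch keeps the input and the accumulator in floating point and shows only the re-quantization of the accumulator output with a range predicted from earlier frames.

```python
def quantize(values, lo, hi, bits=8):
    """Map floats to unsigned integers in [0, 2**bits - 1] over the range [lo, hi].

    Values outside the range are clipped to the nearest end of the integer range.
    """
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    return [min(levels, max(0, round((v - lo) / scale))) for v in values]


def predict_range(previous, observed, alpha=0.75):
    """Weighted average of the previous prediction and the newly observed (min, max)."""
    return tuple(alpha * p + (1.0 - alpha) * o for p, o in zip(previous, observed))


def conv1d(x, kernel):
    """'Valid' 1-D convolution (no kernel flip), standing in for a real conv layer."""
    n = len(x) - len(kernel) + 1
    return [sum(x[i + j] * kernel[j] for j in range(len(kernel))) for i in range(n)]


def run(frames, kernel, alpha=0.75, bits=8):
    """Process frames in order, re-quantizing each output with the predicted range."""
    # Assumed initialization: plain min/max calibration on the first frame.
    first = conv1d(frames[0], kernel)
    predicted = (min(first), max(first))

    outputs = []
    for frame in frames:
        acc = conv1d(frame, kernel)  # accumulator contents for this frame
        # Re-quantize with the range predicted from earlier frames, so the
        # current frame's true range is not needed before writing results out.
        outputs.append(quantize(acc, predicted[0], predicted[1], bits))
        # Fold the observed range into the prediction used for the next frame.
        predicted = predict_range(predicted, (min(acc), max(acc)), alpha)
    return outputs, predicted


if __name__ == "__main__":
    outputs, predicted = run([[0, 1, 2, 3], [0, 2, 4, 6]], kernel=[1, 1])
    print(outputs, predicted)
```

In this sketch a larger `alpha` makes the predicted range follow the history more closely and react more slowly to a frame whose values fall outside it; such values are clipped until the prediction catches up.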
