OPTIMIZING LOW PRECISION INFERENCE MODELS FOR DEPLOYMENT OF DEEP NEURAL NETWORKS
Abstract
Systems, apparatuses, and methods may provide technology for optimizing an inference neural network model that performs asymmetric quantization. The technology generates a quantized neural network in which the model weights are quantized as signed integer values and the input layer is configured to quantize input values as unsigned integer values. It then generates a weights accumulation table based on the quantized model weights and the kernel size of the neural network, and generates an output restoration function for the output layer of the network based on the weights accumulation table and the kernel size. The technology may also perform per-input-channel quantization and mixed-precision auto-tuning.
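The scheme described in the abstract can be illustrated with a minimal sketch. The idea is that for asymmetric input quantization, the integer accumulation contains a term proportional to the input zero point times the sum of the quantized weights; precomputing those per-output-channel weight sums (the "weights accumulation table") lets a cheap output restoration step remove that term. All function names, the min/max scale computation, and the matmul-instead-of-convolution simplification below are illustrative assumptions, not details taken from the patent text.

```python
import numpy as np

def quantize_weights_signed(w):
    """Symmetric signed-int8 quantization of weights (assumed scheme)."""
    scale = np.abs(w).max() / 127.0
    wq = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return wq, scale

def quantize_input_unsigned(x):
    """Asymmetric uint8 quantization of inputs; zero point shifts the range."""
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / 255.0
    zp = int(np.round(-lo / scale))
    xq = np.clip(np.round(x / scale) + zp, 0, 255).astype(np.uint8)
    return xq, scale, zp

def weights_accumulation_table(wq):
    """Per-output-channel sums of quantized weights (the restoration table)."""
    return wq.astype(np.int32).sum(axis=1)

def quantized_matmul_with_restoration(xq, x_scale, x_zp, wq, w_scale, table):
    """Integer matmul followed by output restoration.

    y = sum_k w[o,k] * x[k]
      ~ w_scale * x_scale * (sum_k wq[o,k]*xq[k] - x_zp * sum_k wq[o,k]),
    where the second term is exactly x_zp * table[o].
    """
    acc = wq.astype(np.int32) @ xq.astype(np.int32)  # raw integer accumulate
    acc -= x_zp * table                              # restoration step
    return acc.astype(np.float32) * (w_scale * x_scale)

# Usage example: a 4x8 weight matrix applied to an 8-element input vector.
rng = np.random.default_rng(0)
w = rng.normal(size=(4, 8)).astype(np.float32)
x = rng.uniform(-1.0, 2.0, size=8).astype(np.float32)

wq, w_scale = quantize_weights_signed(w)
xq, x_scale, x_zp = quantize_input_unsigned(x)
table = weights_accumulation_table(wq)

y_ref = w @ x
y_q = quantized_matmul_with_restoration(xq, x_scale, x_zp, wq, w_scale, table)
print(np.max(np.abs(y_ref - y_q)))  # small quantization error
```

Because the table depends only on the weights, it can be built once at model-optimization time, so the per-inference cost of restoring the output is a single subtraction per output channel.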