Method and apparatus for learning low-precision neural network that combines weight quantization and activation quantization
Abstract
A method is provided. The method includes selecting a neural network model, wherein the neural network model includes a plurality of layers, and wherein each of the plurality of layers includes weights and activations; modifying the neural network model by inserting a plurality of quantization layers within the neural network model; associating a cost function with the modified neural network model, wherein the cost function includes a first coefficient corresponding to a first regularization term, and wherein an initial value of the first coefficient is pre-defined; and training the modified neural network model to generate quantized weights for a layer by increasing the first coefficient until all weights are quantized and the first coefficient satisfies a pre-defined threshold, further including optimizing a weight scaling factor for the quantized weights and an activation scaling factor for quantized activations, and wherein the quantized weights are quantized using the optimized weight scaling factor.
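The training step described in the abstract can be illustrated with a small sketch. The following is not the patented procedure, only a toy approximation under stated assumptions: the "layer" is a single linear regression, the first regularization term is taken to be the mean squared quantization error of the weights, the weight scaling factor is chosen by a simple grid search, and the final step snaps the weights onto the quantization grid. Activation quantization and the activation scaling factor are not modeled. All function names and hyperparameters (`quantize`, `best_scale`, `train`, `growth`, `alpha_max`) are hypothetical.

```python
import numpy as np

def quantize(w, delta, bits=4):
    """Uniform symmetric quantizer: round to the nearest multiple of delta, then clip."""
    qmax = 2 ** (bits - 1) - 1
    return delta * np.clip(np.round(w / delta), -qmax - 1, qmax)

def best_scale(w, bits=4, candidates=50):
    """Pick the scaling factor with the lowest quantization MSE.
    Grid search is an assumption; the abstract only says the factor is optimized."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = np.max(np.abs(w)) + 1e-12
    grid = np.linspace(max_abs / candidates, max_abs, candidates) / qmax
    errors = [np.mean((w - quantize(w, d, bits)) ** 2) for d in grid]
    return float(grid[int(np.argmin(errors))])

def train(x, y, bits=4, alpha=1e-3, growth=1.05, alpha_max=10.0, lr=0.05):
    """Minimize loss + alpha * R(w), where R is the mean squared quantization
    error, while increasing alpha until it reaches the threshold alpha_max."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.5, size=x.shape[1])
    while alpha < alpha_max:
        delta = best_scale(w, bits)                  # re-optimize the weight scaling factor
        q = quantize(w, delta, bits)
        grad_loss = 2 * x.T @ (x @ w - y) / len(y)   # gradient of the mean squared error loss
        grad_reg = 2 * (w - q) / w.size              # gradient of R, treating q as a constant
        w -= lr * (grad_loss + alpha * grad_reg)
        alpha *= growth                              # raise the regularization coefficient
    delta = best_scale(w, bits)
    # Final snap onto the grid (an assumption of this sketch).
    return quantize(w, delta, bits), delta, alpha
```

As the coefficient grows, the regularization term pulls each weight toward its nearest quantization level, so the final snap changes the weights less than it would at the start of training. The defaults keep `lr * alpha_max` small enough for plain gradient descent to remain stable on this toy problem; a larger threshold would need a smaller learning rate or a different optimizer.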