
Self-Tuning Incremental Model Compression Solution in Deep Neural Network with Guaranteed Accuracy Performance


Abstract

A method of compressing a pre-trained deep neural network model includes inputting the pre-trained model as a candidate model. The candidate model is compressed by increasing its sparsity, removing at least one batch normalization layer present in the candidate model, and quantizing all remaining weights into a fixed-point representation to form a compressed model. The accuracy of the compressed model is then determined using an end-user training and validation data set. Compression of the candidate model is repeated while the accuracy holds or improves. When the accuracy declines, the hyperparameters governing compression are adjusted and compression of the candidate model is repeated. The compressed model is output for inference use once its accuracy meets or exceeds the end-user performance metric and target.
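The self-tuning loop described in the abstract can be sketched as a small, self-contained program. This is a minimal illustration, not the patented implementation: the "model" is a flat list of weights, the validation set is a reference weight vector, batch-normalization removal is omitted, and the hyperparameter names (`sparsity`, `frac_bits`) and their adjustment rules are assumptions introduced here for illustration.

```python
def prune(weights, sparsity):
    """Zero out the smallest-magnitude weights until roughly
    a `sparsity` fraction of them are zero (magnitude pruning)."""
    n_zero = int(len(weights) * sparsity)
    threshold = sorted(abs(w) for w in weights)[n_zero - 1] if n_zero else -1.0
    return [0.0 if abs(w) <= threshold else w for w in weights]

def quantize(weights, frac_bits):
    """Round each weight to a fixed-point grid with `frac_bits`
    fractional bits (step size 1 / 2**frac_bits)."""
    scale = 1 << frac_bits
    return [round(w * scale) / scale for w in weights]

def accuracy(weights, reference):
    """Toy validation metric: 1 minus the mean absolute deviation
    from the reference weights (stands in for validation accuracy)."""
    err = sum(abs(w - r) for w, r in zip(weights, reference)) / len(reference)
    return max(0.0, 1.0 - err)

def compress(model, reference, target_accuracy,
             sparsity=0.2, frac_bits=8, max_iters=20):
    """Iteratively prune and quantize; keep compressing while accuracy
    holds, back off the hyperparameters when it declines, and stop
    once the target accuracy is met."""
    best, best_acc = list(model), 0.0
    for _ in range(max_iters):
        candidate = quantize(prune(model, sparsity), frac_bits)
        acc = accuracy(candidate, reference)
        if acc > best_acc:
            # Accuracy improved: keep the candidate and compress harder.
            best, best_acc = candidate, acc
            sparsity = min(0.9, sparsity + 0.1)
        else:
            # Accuracy declined: relax sparsity, spend more bits.
            sparsity = max(0.1, sparsity - 0.05)
            frac_bits += 2
        if best_acc >= target_accuracy:
            break
    return best, best_acc
```

In this sketch the accept/reject decision and the hyperparameter adjustment mirror the loop in the abstract; a real implementation would evaluate accuracy on the end-user validation set and fold batch-normalization parameters into the adjacent layer's weights before quantizing.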
