International Joint Conference on Neural Networks

EvoQ: Mixed Precision Quantization of DNNs via Sensitivity Guided Evolutionary Search



Abstract

Network quantization effectively reduces computation and memory costs without modifying network structures, facilitating the deployment of deep neural networks (DNNs) on edge devices. However, most existing methods require time-consuming training or fine-tuning, as well as access to the original training dataset, which may be unavailable due to privacy or security concerns. In this paper, we introduce EvoQ, a novel method that employs evolutionary search to achieve mixed-precision quantization with limited data, optimizing resource allocation without adding computational overhead. Given the scarcity of samples and the cost of the search, we use 50 samples to measure the output difference between the quantized model and the pre-trained model when evaluating a quantization policy, which saves considerable time while maintaining high accuracy. To improve search efficiency, we analyze the quantization sensitivity of each layer and use the results to guide the mutation operation. Finally, we calibrate the outputs and intermediate features of the quantized model using the same 50 samples to further improve performance. We conduct extensive experiments on a diverse set of models, including ResNet18/50/101, SqueezeNet, ShuffleNetV2, and MobileNetV2 on ImageNet, as well as SSD-VGG and SSD-ResNet50 on PASCAL VOC. Our method yields clear performance gains and outperforms existing post-training quantization methods, demonstrating the effectiveness of EvoQ.
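The pipeline described in the abstract — per-layer sensitivity analysis guiding mutation, and a small calibration set scoring candidate bit-width policies — can be illustrated with a toy sketch. This is not the authors' implementation: the model (a small random MLP), the uniform symmetric quantizer, the bit-width range, and all helper names (`quantize`, `fitness`, `mutate`) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "model": a stack of random linear layers standing in for a DNN.
weights = [rng.standard_normal((8, 8)) for _ in range(4)]
calib = rng.standard_normal((50, 8))  # 50 calibration samples, as in EvoQ

def quantize(w, bits):
    """Uniform symmetric quantization of a weight tensor to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    return np.round(w / scale) * scale

def forward(ws, x):
    for w in ws:
        x = np.maximum(x @ w, 0.0)  # ReLU MLP
    return x

ref = forward(weights, calib)  # full-precision reference outputs

def fitness(policy):
    """Output difference between quantized and full-precision model
    on the calibration set (lower is better)."""
    qws = [quantize(w, b) for w, b in zip(weights, policy)]
    return float(np.mean((forward(qws, calib) - ref) ** 2))

# Per-layer sensitivity: error when only that layer is heavily quantized.
sens = np.array([fitness([2 if j == i else 8 for j in range(4)])
                 for i in range(4)])
mut_prob = sens / sens.sum()  # bias mutation toward sensitive layers

def mutate(policy):
    """Resample the bit-width of one layer, chosen by sensitivity."""
    child = list(policy)
    i = rng.choice(len(policy), p=mut_prob)
    child[i] = int(rng.integers(2, 9))  # bit-widths in {2..8}
    return child

# Simple (mu + lambda) evolutionary search over bit-width policies.
pop = [[int(rng.integers(2, 9)) for _ in range(4)] for _ in range(8)]
for _ in range(20):
    pop.sort(key=fitness)              # select the fittest policies
    pop = pop[:4] + [mutate(p) for p in pop[:4]]

best = min(pop, key=fitness)
```

The sensitivity-weighted `mut_prob` concentrates search effort on the layers whose quantization most distorts the output, which is the intuition behind EvoQ's guided mutation; a real implementation would also calibrate activations and run on actual networks.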


