
GenExp: Multi-objective pruning for deep neural network based on genetic algorithm



Abstract

Unstructured deep neural network (DNN) pruning has been widely studied. However, previous schemes focused only on compressing the model's memory footprint, which led to a relatively low reduction ratio in computational workload. This study demonstrates that the main reason behind this is the inconsistent distribution of the DNN model's memory footprint and workload among different layers. Based on this observation, we propose to map the network pruning flow as a multi-objective optimization problem and design an improved genetic algorithm, which can efficiently explore the whole pruning structure space with both pruning goals equally constrained, to find a solution that strikes a judicious balance between the DNN's model size and workload. Experiments show that the proposed scheme can achieve up to 34% further reduction in the model's computational workload compared to the state-of-the-art pruning schemes [11,33] for ResNet50 on the ILSVRC-2012 dataset. We have also deployed the pruned ResNet50 models on a dedicated DNN accelerator; the measured data show a considerable 6x reduction in inference time compared to an FPGA accelerator implementing a dense CNN model quantized in INT8 format, and a 2.27x improvement in power efficiency over 2080Ti GPU-based implementations. (c) 2021 Elsevier B.V. All rights reserved.
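To make the abstract's formulation more concrete, the sketch below shows one way a genetic algorithm can search per-layer keep ratios while treating parameter count and computational workload (FLOPs) as equally constrained objectives. This is a minimal illustration, not the paper's GenExp implementation: the layer costs, target ratio, GA hyper-parameters, and all function names are assumptions for demonstration, and accuracy evaluation of each candidate network is omitted.

```python
# Illustrative sketch only: a toy genetic search over per-layer keep ratios
# that balances two objectives (parameters kept vs. FLOPs kept). All numbers
# and hyper-parameters are hypothetical; this is not the GenExp algorithm.
import random

# Hypothetical per-layer (params, flops) costs of an unpruned network.
LAYERS = [(4_718_592, 231_211_008), (2_359_296, 115_605_504), (589_824, 57_802_752)]
TARGET = 0.20            # assumed goal: keep ~20% of both params and FLOPs
POP, GENS, MUT = 32, 60, 0.1

def random_individual():
    # A chromosome is one keep-ratio per layer.
    return [random.uniform(0.05, 1.0) for _ in LAYERS]

def fitness(ind):
    total_p = sum(p for p, _ in LAYERS)
    total_f = sum(f for _, f in LAYERS)
    kept_p = sum(r * p for r, (p, _) in zip(ind, LAYERS)) / total_p
    kept_f = sum(r * f for r, (_, f) in zip(ind, LAYERS)) / total_f
    # Penalize deviation from the target on BOTH objectives equally,
    # mirroring the idea of constraining memory footprint and workload together.
    return -(abs(kept_p - TARGET) + abs(kept_f - TARGET))

def crossover(a, b):
    cut = random.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(ind):
    return [min(1.0, max(0.05, r + random.gauss(0, MUT))) if random.random() < 0.3 else r
            for r in ind]

pop = [random_individual() for _ in range(POP)]
for _ in range(GENS):
    pop.sort(key=fitness, reverse=True)
    elite = pop[: POP // 4]
    pop = elite + [mutate(crossover(*random.sample(elite, 2))) for _ in range(POP - len(elite))]

best = max(pop, key=fitness)
print("per-layer keep ratios:", [round(r, 2) for r in best])
```

In the actual work, the search operates over the full unstructured pruning structure space rather than coarse per-layer ratios, and a practical fitness would also account for the pruned model's accuracy; the sketch only illustrates how uneven per-layer costs make jointly constraining model size and workload a genuine multi-objective search.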

Bibliographic information

  • Source
    《Neurocomputing》 | 2021, Issue 3 | pp. 81-94 | 14 pages
  • Author affiliations

    Beijing Jiaotong Univ Inst Informat Sci Beijing 100044 Peoples R China|Beijing Key Lab Adv Informat Sci & Network Techno Beijing 100044 Peoples R China;

    Beijing Jiaotong Univ Inst Informat Sci Beijing 100044 Peoples R China|Beijing Key Lab Adv Informat Sci & Network Techno Beijing 100044 Peoples R China;

    Beijing Jiaotong Univ Inst Informat Sci Beijing 100044 Peoples R China|Beijing Key Lab Adv Informat Sci & Network Techno Beijing 100044 Peoples R China;

    Kwai Inc Heterogeneous Comp Grp Palo Alto CA 94306 USA;

    Kwai Inc Heterogeneous Comp Grp Palo Alto CA 94306 USA;

    Beijing Jiaotong Univ Inst Informat Sci Beijing 100044 Peoples R China|Beijing Key Lab Adv Informat Sci & Network Techno Beijing 100044 Peoples R China;

  • Indexing: Science Citation Index (SCI); Engineering Index (EI)
  • Format: PDF
  • Language: English
  • CLC classification
  • Keywords

    Efficient inference; Network compression; Weights pruning; Genetic algorithm; Convolutional neural networks;

