IEEE Transactions on Parallel and Distributed Systems

Parallel Blockwise Knowledge Distillation for Deep Neural Network Compression



Abstract

Deep neural networks (DNNs) have been extremely successful in solving many challenging AI tasks in natural language processing, speech recognition, and computer vision. However, DNNs are typically computation-intensive, memory-demanding, and power-hungry, which significantly limits their usage on platforms with constrained resources. Therefore, a variety of compression techniques (e.g., quantization, pruning, and knowledge distillation) have been proposed to reduce the size and power consumption of DNNs. Blockwise knowledge distillation is one such technique that can effectively reduce the size of a highly complex DNN; however, it is not widely adopted due to its long training time. In this article, we propose a novel parallel blockwise distillation algorithm to accelerate the distillation process of sophisticated DNNs. Our algorithm leverages local information to conduct independent blockwise distillation, utilizes depthwise separable layers as the efficient replacement block architecture, and properly addresses the limiting factors (e.g., dependency, synchronization, and load balancing) that affect parallelism. Experimental results on an AMD server with four GeForce RTX 2080 Ti GPUs show that our algorithm achieves a 3x speedup with 19 percent energy savings on VGG distillation, and a 3.5x speedup with 29 percent energy savings on ResNet distillation, both with negligible accuracy loss. The speedup of ResNet distillation can be further improved to 3.87x when using four RTX 6000 GPUs in a distributed cluster.
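The key idea in the abstract — training each replacement block to mimic its teacher block independently, using only locally computed activations, so the blocks have no training-time dependency on each other — can be illustrated with a toy NumPy sketch. This is a hypothetical illustration, not the authors' implementation: the linear-tanh "blocks", their sizes, the learning rate, and the thread pool are all made-up stand-ins (real replacement blocks would be depthwise separable convolutions trained on GPUs).

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

rng = np.random.default_rng(0)

# Toy "teacher": three fixed tanh blocks (stand-ins for VGG/ResNet blocks).
teacher = [rng.standard_normal((8, 8)) * 0.1 for _ in range(3)]

def teacher_forward(x, upto):
    """Run teacher blocks 0..upto-1 to obtain the local input of block `upto`."""
    for w in teacher[:upto]:
        x = np.tanh(x @ w)
    return x

def distill_block(i, steps=300, lr=0.5):
    """Fit one student block to teacher block i, independently of other blocks."""
    local_rng = np.random.default_rng(100 + i)  # per-worker RNG (thread safety)
    w_s = np.zeros((8, 8))                      # student block's weights
    for _ in range(steps):
        x = local_rng.standard_normal((32, 8))
        h = teacher_forward(x, i)               # local input to block i
        target = np.tanh(h @ teacher[i])        # teacher block's output
        pred = np.tanh(h @ w_s)                 # student block's output
        # Gradient of the per-block MSE distillation loss.
        grad = h.T @ ((pred - target) * (1.0 - pred ** 2)) / len(x)
        w_s -= lr * grad
    return i, w_s

# Because each block's loss depends only on local activations, all blocks
# can be distilled concurrently (one worker per block/GPU in the paper).
with ThreadPoolExecutor(max_workers=3) as pool:
    students = dict(pool.map(distill_block, range(3)))

# Each trained student block closely matches its teacher block.
probe = np.random.default_rng(7).standard_normal((64, 8))
for i in range(3):
    h = teacher_forward(probe, i)
    mse = float(np.mean((np.tanh(h @ students[i]) - np.tanh(h @ teacher[i])) ** 2))
    print(f"block {i}: distillation MSE {mse:.5f}")
```

Note that the teacher is still needed once per block to produce local inputs, but no student block waits on another student block's training, which is what removes the serial dependency the paper identifies.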

