Published in: IEEE Computer Society Annual Symposium on VLSI

Towards Efficient Compact Network Training on Edge-Devices

Abstract

Currently, there is a trend toward deploying training on edge devices, which is crucial to future AI applications in scenarios with transfer and online learning demands. Specifically, accuracy may degrade severely when trained models are deployed directly on edge devices, because the local environment forms an edge-local dataset that often differs from the generic dataset. However, training on edge devices with limited computing and memory capability is a challenging problem. In this paper, we propose a novel quantization training framework for efficient compact network training on edge devices. First, training-aware symmetric quantization is introduced to quantize all of the data types in the training process. Then, a channel-wise quantization method is adopted for compact network quantization, which has a significantly high tolerance to quantization errors and can make the training process more stable. For further efficient training, we build a hardware evaluation platform to evaluate different settings of the network, so as to achieve a better trade-off among accuracy, energy, and latency. Finally, we evaluate two widely used compact networks on a domain adaptation dataset for image classification, and the results demonstrate that the proposed methods achieve an 8.4× to 17.2× reduction in energy and an 11.9× to 16.3× reduction in latency compared with 32-bit implementations, while maintaining classification accuracy.
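
The abstract does not give the exact quantization scheme, so the following is only a minimal sketch of generic symmetric uniform quantization, comparing one shared scale for a whole tensor against one scale per channel. All function names and the toy weights are hypothetical, and the sketch covers stored values only, not the paper's training-aware handling of gradients or activations.

```python
from typing import List


def symmetric_scale(values: List[float], num_bits: int = 8) -> float:
    """Scale for symmetric quantization: the largest magnitude maps to qmax."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    return max_abs / qmax if max_abs > 0 else 1.0


def quantize(values: List[float], scale: float, num_bits: int = 8) -> List[int]:
    """Round to integer codes in [-qmax, qmax]; no zero-point, so 0.0 maps to 0."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def dequantize(codes: List[int], scale: float) -> List[float]:
    """Map integer codes back to approximate real values."""
    return [c * scale for c in codes]


def mean_abs_error(channels: List[List[float]], per_channel: bool,
                   num_bits: int = 8) -> float:
    """Mean absolute reconstruction error with a shared or per-channel scale."""
    flat = [v for ch in channels for v in ch]
    shared = symmetric_scale(flat, num_bits)
    total = 0.0
    for ch in channels:
        scale = symmetric_scale(ch, num_bits) if per_channel else shared
        restored = dequantize(quantize(ch, scale, num_bits), scale)
        total += sum(abs(a - b) for a, b in zip(ch, restored))
    return total / len(flat)


# Toy weights (assumed for illustration): channel ranges differ by ~100x.
weights = [
    [0.9, -1.2, 0.4, 1.0],          # wide-range channel
    [0.004, -0.009, 0.002, 0.007],  # narrow-range channel
]
per_tensor_err = mean_abs_error(weights, per_channel=False)
per_channel_err = mean_abs_error(weights, per_channel=True)
print(f"per-tensor error:  {per_tensor_err:.6f}")
print(f"per-channel error: {per_channel_err:.6f}")
```

With a shared scale, the narrow-range channel is squeezed into only a few integer levels, while a per-channel scale lets it use the full integer range. This is one plausible reading of why channel-wise quantization tolerates quantization error better in compact networks, whose channels can have very different ranges.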