IEEE International Conference on Multimedia and Expo

Efficient Implementation of Convolutional Neural Networks with End to End Integer-Only Dataflow

Abstract

Linear INT8 quantization is presented to construct an end-to-end integer-only dataflow for efficient inference of modern CNNs. The INT8 method is implemented with a unified layer representation, so quantized CNNs can be partitioned into computation subgraphs of stacked unified layers, each with a simplified integer-only arithmetic flow and a scaling-back mechanism, which makes the method well suited to dedicated hardware implementation. Experimental results show that both the classification and object detection models quantized by the proposed INT8 method suffer approximately 1% accuracy loss, which is comparable to TensorRT. Accordingly, a deep learning accelerator (DLA) with an integer-only dataflow and an efficient memory hierarchy is designed for CNN applications.
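
The abstract does not spell out the arithmetic, so the following is only a minimal sketch of what an integer-only layer with a scaling-back step could look like. It assumes symmetric per-tensor linear quantization (no zero point), a plain dot-product layer in place of a convolution, and a 15-bit fixed-point rescale factor. All function names and the scale values are hypothetical and are not taken from the paper.

```python
# Illustrative only: symmetric INT8 quantization with integer-only rescaling.
# Names, scales, and the 15-bit shift are assumptions, not the paper's design.

def quantize(values, scale):
    """Linear quantization: q = clamp(round(x / scale), -127, 127)."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def fixed_point_multiplier(real_multiplier, shift=15):
    """Approximate a real factor by an integer m so that m / 2**shift is close to it."""
    return round(real_multiplier * (1 << shift)), shift

def requantize(acc, m, shift):
    """Scale a wide integer accumulator back to INT8 with multiply, add, and shift only."""
    rounded = (acc * m + (1 << (shift - 1))) >> shift  # round to nearest
    return max(-127, min(127, rounded))

def int8_dot_layer(x_q, w_q, m, shift):
    """One layer: integer dot products followed by the integer scaling-back step."""
    out = []
    for row in w_q:
        acc = sum(xi * wi for xi, wi in zip(x_q, row))  # integer accumulation
        out.append(requantize(acc, m, shift))
    return out

# Offline (floating point is allowed here): choose scales and precompute m.
s_x, s_w, s_y = 0.01, 0.02, 0.05
m, shift = fixed_point_multiplier(s_x * s_w / s_y)

x_q = quantize([0.5, -1.0, 0.25], s_x)
w_q = [quantize(row, s_w) for row in [[0.4, -0.2, 1.0], [-0.6, 0.8, 0.1]]]

# Online: integers only.
y_q = int8_dot_layer(x_q, w_q, m, shift)
print(y_q)  # [13, -21]
```

Multiplying the outputs by `s_y` gives 0.65 and -1.05, against exact floating-point results of 0.65 and -1.075. Because the output of one layer is already INT8, it can feed the next layer directly, which is what allows layers to be stacked without returning to floating point.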
