
Optimizing deep learning inference on mobile devices with neural network accelerators



Abstract

Deep learning is now widely used in intelligent apps on mobile devices. In pursuit of ultra-low power and latency, integrating neural network accelerators (NNAs) into mobile phones has become a trend. However, conventional deep learning programming frameworks do not support such devices well, leading to low computing efficiency and high memory occupation. To address this problem, a two-stage pipeline is proposed for optimizing deep learning model inference on mobile devices with NNAs, in terms of both speed and memory footprint. The first stage reduces the computation workload via graph optimization, including splitting and merging nodes. The second stage goes further by optimizing at the compilation level, including kernel fusion and in-advance compilation. The proposed optimizations are evaluated on a commercial mobile phone with an NNA. The experimental results show that the proposed approaches achieve a 2.8× to 26× speedup and reduce the memory footprint by up to 75%.
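To make the two stages concrete, below is a minimal sketch in Python over a toy graph IR. The names (Node, merge_consecutive_elementwise, fused_conv_bias_relu) and the fusion rules are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch of the two optimization stages described in the abstract.
# The IR, node types, and fusion rules are assumptions, not the paper's code.
from dataclasses import dataclass, field

@dataclass
class Node:
    op: str                                  # e.g. "conv", "add", "relu"
    inputs: list = field(default_factory=list)

# Stage 1 (graph optimization): merge a chain of element-wise nodes
# (e.g. add -> relu) into one node, so the runtime issues a single
# accelerator task instead of two.
ELEMENTWISE = {"add", "relu", "mul"}

def merge_consecutive_elementwise(nodes):
    merged, i = [], 0
    while i < len(nodes):
        node = nodes[i]
        if (i + 1 < len(nodes)
                and node.op in ELEMENTWISE
                and nodes[i + 1].op in ELEMENTWISE):
            # Fuse the adjacent pair into one composite node.
            node = Node(op=f"{node.op}+{nodes[i + 1].op}", inputs=node.inputs)
            i += 2
        else:
            i += 1
        merged.append(node)
    return merged

# Stage 2 (compilation level): a fused kernel applies bias and ReLU while
# the convolution's partial result is still local, avoiding a round trip
# to device memory between separate kernels. The dot product stands in
# for one convolution output element.
def fused_conv_bias_relu(x, w, b):
    y = sum(xi * wi for xi, wi in zip(x, w))  # partial result stays local
    return max(y + b, 0.0)                    # bias + ReLU fused in

if __name__ == "__main__":
    graph = [Node("conv"), Node("add"), Node("relu")]
    print([n.op for n in merge_consecutive_elementwise(graph)])
    # -> ['conv', 'add+relu']
    print(fused_conv_bias_relu([1.0, 2.0], [0.5, -1.0], 0.2))
    # -> 0.0
```

In this sketch, stage 1 halves the number of task launches for element-wise chains, and stage 2 keeps intermediate results out of device memory; in-advance compilation would then translate such fused nodes into accelerator binaries once, at model-load time, rather than on every inference.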

Bibliographic record

  • Source
    High Technology Letters (English edition) | 2019, Issue 4 | pp. 417-425 | 9 pages
  • Authors

    Zeng Xi; Xu Yunlong; Zhi Tian;

  • Author affiliations

    Intelligent Processor Research Center, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, P.R. China;

    University of Chinese Academy of Sciences, Beijing 100049, P.R. China;

    Cambricon Technologies Corporation Limited, Beijing 100191, P.R. China;

  • Indexing information
  • Original format: PDF
  • Language: eng
  • CLC classification
  • Keywords

  • Date added: 2022-08-19 04:31:36