
DNN Placement and Inference in Edge Computing



Abstract

The deployment of deep neural network (DNN) models in software applications is increasing rapidly with the exponential growth of artificial intelligence. Currently, developers deploy such models manually in the cloud, weighing several user requirements, and the decisions of model selection and user assignment are difficult to make. With the rise of the edge computing paradigm, companies tend to deploy applications as close to the user as possible. In this setting, the problem of DNN model selection and inference serving becomes harder because of the communication latency introduced between nodes. We present an automatic method for DNN placement and inference in edge computing: a mathematical formulation of the DNN Model Variant Selection and Placement (MVSP) problem that accounts for the inference latency of different model-variants, the communication latency between nodes, and the utilization cost of edge computing nodes. We further propose a general heuristic algorithm to solve the MVSP problem. We analyze the effects of hardware sharing on inference latency, using GPU edge computing nodes shared between different DNN model-variants as an example. Evaluating our model numerically, we show the potential of GPU sharing: average per-request latency (on the millisecond scale) decreases by 33% under low load and by 21% under high load. We study the tradeoff between latency and cost and show the Pareto-optimal curves. Finally, we compare the optimal solution with the proposed heuristic and show that the heuristic increases the average latency per request by more than 60%; this gap could be narrowed by more efficient placement algorithms.
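
The abstract does not reproduce the MVSP formulation itself. As a rough illustration of its likely shape, one plausible mixed-integer sketch is given below; every variable and symbol here is an assumption for illustration, not the paper's notation.

\min_{x,y}\;\; \alpha \sum_{u \in U}\sum_{v \in V}\sum_{n \in N} y_{u,v,n}\,\bigl(d_{u,n} + \ell_{v,n}\bigr) \;+\; \beta \sum_{v \in V}\sum_{n \in N} c_{n}\, x_{v,n}

\text{s.t.}\quad \sum_{v \in V}\sum_{n \in N} y_{u,v,n} = 1 \;\;\forall u \in U, \qquad y_{u,v,n} \le x_{v,n}, \qquad \sum_{v \in V} r_{v}\, x_{v,n} \le R_{n} \;\;\forall n \in N

Here $x_{v,n}\in\{0,1\}$ places model-variant $v$ on edge node $n$, $y_{u,v,n}\in\{0,1\}$ assigns user $u$ to that placement, $d_{u,n}$ is the user-to-node communication latency, $\ell_{v,n}$ the inference latency of variant $v$ on node $n$, $c_n$ the node utilization cost, $r_v$ the variant's resource demand, $R_n$ the node capacity, and $\alpha,\beta$ weight the latency/cost tradeoff the paper studies.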
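The abstract likewise leaves the heuristic unspecified. A minimal greedy sketch in Python, under the assumption that per-pair latencies and node capacities are given as inputs (all names below are hypothetical), could look like this:

from itertools import product

def greedy_mvsp(users, variants, nodes, comm, infer, demand, capacity):
    """Greedy MVSP sketch (illustrative, not the paper's algorithm).
    comm[u][n]: user-to-node communication latency
    infer[v][n]: inference latency of variant v on node n
    demand[v]: resource need of variant v; capacity[n]: node budget."""
    placed = set()                      # (variant, node) pairs already deployed
    used = {n: 0.0 for n in nodes}      # resources consumed per node
    assignment = {}
    for u in users:
        best, best_lat = None, float("inf")
        for v, n in product(variants, nodes):
            # A variant consumes capacity only the first time it is placed,
            # mirroring the hardware-sharing idea in the paper.
            extra = 0.0 if (v, n) in placed else demand[v]
            if used[n] + extra > capacity[n]:
                continue
            lat = comm[u][n] + infer[v][n]
            if lat < best_lat:
                best, best_lat = (v, n), lat
        if best is None:
            raise RuntimeError(f"no feasible placement for user {u}")
        v, n = best
        if (v, n) not in placed:
            placed.add((v, n))
            used[n] += demand[v]
        assignment[u] = (v, n, best_lat)
    return assignment, placed

Assigning users one at a time and reusing already-placed variants keeps each (variant, node) deployment charged against node capacity only once; the more-than-60% latency gap reported against the optimum suggests the paper's own heuristic is similarly simple.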
