IEEE International Conference on Distributed Computing Systems

PerDNN: Offloading Deep Neural Network Computations to Pervasive Edge Servers

Abstract

Emerging mobile applications, such as cognitive assistance based on deep neural networks (DNNs), require low latency as well as high computation power. To meet these requirements, edge computing (also called fog computing) has been proposed, which offloads computations to edge servers located near mobile clients. This paradigm shift from cloud to edge requires new computing infrastructure in which edge servers are pervasively distributed over a region. This paper presents PerDNN, a system that executes the DNNs of mobile clients collaboratively with pervasive edge servers. PerDNN dynamically partitions DNN computation between a client and an edge server to minimize execution latency. It predicts the next edge server the client will visit, calculates a speculative partitioning plan, and transfers the server-side DNN layers to the predicted server in advance, which reduces the initialization overhead needed to start offloading and thus avoids cold starts. We avoid excessive network traffic between edge servers, however, by migrating only a tiny fraction of the server-side DNN layers, with negligible performance loss. We also use the GPU statistics of edge servers for DNN partitioning, to cope with the resource contention caused by multi-client offloading. In simulations driven by human mobility trace datasets and execution profiles of real hardware, PerDNN reduced the occurrence of cold starts by up to 90% and achieved 58% higher throughput when clients change their offloading servers, compared to a baseline without proactive DNN transmission.
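As a concrete illustration of the latency-driven partitioning described in the abstract, the following is a minimal sketch under simple assumptions (per-layer profile tables, a single split point, and a linear contention factor standing in for the GPU statistics). The function best_split and all parameter names are hypothetical, not PerDNN's actual API.

```python
from typing import List

def best_split(client_ms: List[float],
               server_ms: List[float],
               act_bytes: List[int],
               bandwidth_mbps: float,
               server_load: float = 1.0) -> int:
    """Pick the split index k that minimizes end-to-end latency.

    Layers [0, k) run on the client, layers [k, n) on the edge server.
    act_bytes[k] is the size of the tensor shipped at split k
    (act_bytes[0] = raw model input, act_bytes[n] = 0 for fully local).
    server_load > 1 inflates server-side times to model GPU contention
    from multi-client offloading.
    """
    n = len(client_ms)
    best_k, best_ms = 0, float("inf")
    for k in range(n + 1):  # k == 0: pure offload, k == n: fully local
        transfer_ms = act_bytes[k] * 8 / (bandwidth_mbps * 1e6) * 1000
        total = (sum(client_ms[:k])
                 + transfer_ms
                 + server_load * sum(server_ms[k:]))
        if total < best_ms:
            best_k, best_ms = k, total
    return best_k

# Example: a 4-layer model over a 100 Mbps link, with the server 50%
# contended. All profile numbers here are made up for illustration.
split = best_split(client_ms=[12.0, 30.0, 25.0, 5.0],
                   server_ms=[2.0, 5.0, 4.0, 1.0],
                   act_bytes=[600_000, 800_000, 200_000, 50_000, 0],
                   bandwidth_mbps=100.0,
                   server_load=1.5)
print(f"run layers [0, {split}) on the client, [{split}, 4) on the server")
```

In a PerDNN-style system, the same evaluation would also be run speculatively against the predicted next edge server, so that the chosen server-side layers can be transferred before the client arrives.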
