runData: Re-Distributing Data via Piggybacking for Geo-Distributed Data Analytics Over Edges

Jin Yibo; Qian Zhuzhong; Guo Song; Zhang Sheng; Jiao Lei; Lu Sanglu

Abstract

Efficiently analyzing geo-distributed datasets is emerging as a major demand in cloud-edge systems. Since datasets are often generated close to end users, prior work mainly focuses on offloading suitable tasks from hotspot edges to the datacenter to decrease the overall completion time of submitted jobs in a one-shot manner. However, optimizing the completion time of the current job alone is insufficient in the long term, since some datasets are used multiple times. Instead, optimizing the data distribution is much more efficient and can directly benefit forthcoming jobs, although it may postpone the execution of the current one. Unfortunately, due to the throwaway nature of the data fetcher, existing data analytics systems fail to re-distribute the corresponding data out of hotspot edges after the execution of data analytics. To minimize the overall completion time for a sequence of jobs while guaranteeing the performance of the current one, we propose to re-distribute the data along with task offloading, and formulate the corresponding epsilon-bounded data-driven task scheduling problem over the wide area network, taking edge heterogeneity into account. We design an online schema, runData, which offloads proper tasks and related data via piggybacking to the datacenter based on delicately calculated probabilities. Through rigorous theoretical analysis, runData is proved to concentrate on its optimum with high probability. We implement runData based on Spark and HDFS. Both testbed results and trace-driven simulations show that runData re-distributes proper data via piggybacking and achieves up to a 37 percent reduction in average response time compared with state-of-the-art schemas.
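
The idea of offloading a task together with its input data, so that later jobs over the same dataset no longer need a wide-area transfer, can be illustrated with a minimal Python sketch. This is not the runData algorithm from the paper: the names `Task`, `Placement`, and `schedule` are hypothetical, the offloading probabilities are taken as given inputs rather than calculated, and the rule that a task always runs at the datacenter once its data is there is a simplifying assumption.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: str
    dataset: str  # name of the input dataset this task reads


@dataclass
class Placement:
    # Datasets that have already been copied from an edge to the datacenter.
    at_datacenter: set = field(default_factory=set)


def schedule(tasks, offload_prob, placement, rng):
    """Decide where each task runs (illustrative only, not the paper's method).

    offload_prob maps a dataset name to a hypothetical probability of
    offloading tasks that read it. An offloaded task carries its input data
    along (piggybacking), and the data then stays at the datacenter.
    """
    decisions = {}
    for task in tasks:
        if task.dataset in placement.at_datacenter:
            # Assumption: data moved by an earlier job is reused in place,
            # so no further wide-area transfer is needed.
            decisions[task.task_id] = "datacenter"
        elif rng.random() < offload_prob[task.dataset]:
            # Piggyback the data with the task and remember its new location.
            placement.at_datacenter.add(task.dataset)
            decisions[task.task_id] = "offload"
        else:
            decisions[task.task_id] = "edge"
    return decisions
```

Calling `schedule` once per job with a shared `Placement` shows the long-term effect described in the abstract: a dataset piggybacked during one job is already at the datacenter when a later job reads it.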

Bibliographic details

Source
IEEE Transactions on Parallel and Distributed Systems | 2022, Issue 1 | pp. 40-55 | 16 pages
Authors
Jin Yibo; Qian Zhuzhong; Guo Song; Zhang Sheng; Jiao Lei; Lu Sanglu
Author affiliations

Nanjing Univ Dept Comp Sci & Technol State Key Lab Novel Software Technol Nanjing 210023 Jiangsu Peoples R China;

Nanjing Univ Dept Comp Sci & Technol State Key Lab Novel Software Technol Nanjing 210023 Jiangsu Peoples R China;

Hong Kong Polytech Univ Dept Comp Hung Hom Hong Kong Peoples R China|Hong Kong Polytech Univ Shenzhen Res Inst Shenzhen 515063 Guangdong Peoples R China;

Nanjing Univ Dept Comp Sci & Technol State Key Lab Novel Software Technol Nanjing 210023 Jiangsu Peoples R China;

Univ Oregon Dept Comp & Informat Sci Eugene OR 97403 USA;

Nanjing Univ Dept Comp Sci & Technol State Key Lab Novel Software Technol Nanjing 210023 Jiangsu Peoples R China;

Original format: PDF
Language: English
Keywords
Task analysis; Data analysis; Wide area networks; Data models; Servers; Optimization; Videos; Cloud-edge system; data re-distribution; heterogeneity; online schema;


