
Data distribution and scheduling for distributed analytics tasks



Abstract

We consider a distributed edge computing system where we have a number of interconnected machines with limited communication bandwidth and storage capacity. Analytics tasks run on the machines, where each task runs on a single machine but may require data from multiple other machines. Every task requires a given amount of data to run, and it needs to receive all its data within a specific deadline. The application scenario is that each machine has limited storage, thus we usually cannot place the entire amount of data for a specific task on a single machine that executes the task. We assume that the task execution is sparse in time, so that at most one task is executed in the system at any time. The problem we study in this paper is how to distribute the data on machines in the system, without violating the bandwidth and storage constraints, while ensuring that the data transfer deadlines are met. We prove that the optimal solution to this problem is equivalent to that of a max-flow problem on a specifically constructed graph. We present how to construct this graph so that the problem can be solved using standard algorithms for max-flow problems, and also provide some numerical results and further discussions.
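The abstract does not spell out the graph construction, so the following is only a minimal sketch of how a placement problem of this kind can be posed as max-flow, under simplifying assumptions that are not taken from the paper: data moves only over direct links to the task's host machine, all quantities are integers, and because at most one task runs at a time, tasks do not compete for bandwidth. The node labels, the function names `max_flow` and `plan_placement`, and the example numbers are all hypothetical. The graph has an edge from the source to each task (capacity: the task's data demand), an edge from each task to each machine (capacity: bandwidth to the host times the deadline, or the full demand if the machine is the host itself), and an edge from each machine to the sink (capacity: storage). Under these assumptions a placement meeting every deadline exists exactly when the maximum flow equals the total demand.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp: augment along shortest paths in the residual graph."""
    residual = defaultdict(lambda: defaultdict(int))
    for u, edges in capacity.items():
        for v, c in edges.items():
            residual[u][v] += c
            residual[v][u] += 0  # make sure the reverse edge exists
    total = 0
    while True:
        # Breadth-first search for an augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total, residual
        # Find the bottleneck along the path, then push that much flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def plan_placement(storage, tasks, bandwidth):
    """storage: {machine: units}; tasks: {name: (host, demand, deadline)};
    bandwidth: {(src, dst): units per time step}. All names are illustrative."""
    cap = defaultdict(dict)
    for name, (host, demand, deadline) in tasks.items():
        cap["source"][("task", name)] = demand
        for m in storage:
            # Data already on the host needs no transfer; elsewhere it is
            # limited by what the link can deliver before the deadline.
            limit = demand if m == host else bandwidth.get((m, host), 0) * deadline
            cap[("task", name)][("machine", m)] = limit
    for m, units in storage.items():
        cap[("machine", m)]["sink"] = units
    flow, residual = max_flow(cap, "source", "sink")
    feasible = flow == sum(demand for _, demand, _ in tasks.values())
    placement = {
        (name, m): cap[("task", name)][("machine", m)]
        - residual[("task", name)][("machine", m)]
        for name in tasks
        for m in storage
    }
    return feasible, placement


# Hypothetical instance: three machines with 4 units of storage each.
storage = {"A": 4, "B": 4, "C": 4}
bandwidth = {("B", "A"): 1, ("C", "A"): 2, ("A", "B"): 1, ("C", "B"): 3}
tasks = {"t1": ("A", 6, 2), "t2": ("B", 5, 1)}
feasible, placement = plan_placement(storage, tasks, bandwidth)
```

Here `placement[(task, machine)]` is the amount of that task's data to store on that machine. Supporting multi-hop routing or overlapping task executions would need a richer graph than this sketch builds.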
