A Cluster-Based Data-Centric Model for Network-Aware Task Scheduling in Distributed Systems

Ugo Fiore; Francesco Palmieri; Aniello Castiglione; Alfredo De Santis

Abstract

Big Data processing architectures are now widely recognized as one of the most significant innovations in computing in the last decade. Their enormous potential for collecting and processing huge volumes of data scattered throughout the Internet is opening the door to a new generation of fully distributed applications that, by leveraging the large amount of resources available on the network, will be able to cope with very complex problems and achieve unprecedented performance. However, the Internet is known to have severe scalability limitations in moving very large quantities of data. These limitations make it challenging to use the computing and storage resources available on the network efficiently, so that data-intensive applications can be executed effectively in such a complex distributed environment. This calls for resource scheduling decisions that drive the execution of tasks towards the data, taking network load and capacity into consideration to maximize data access performance and reduce queueing and processing delays as much as possible. Accordingly, this work presents a data-centric meta-scheduling scheme for fully distributed Big Data processing architectures. The scheme is based on clustering techniques whose goal is to aggregate tasks around storage repositories, and it is driven by a new concept of "gravitational" attraction between the tasks and their data of interest. It benefits from heuristic criteria based on network awareness and advance resource reservation in order to suppress long delays in data transfer operations, resulting in an optimized use of data storage and runtime resources at the expense of a limited (polynomial) computational complexity.
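
The abstract does not give the attraction formula or the clustering procedure, so the following is only a minimal sketch of the general idea of pulling tasks towards the repositories that hold their data. It assumes a Newton-like score (data volume divided by the square of a network cost) and a single assignment pass with repositories acting as fixed cluster centres. The function names, the score, and the sample figures are all hypothetical and are not taken from the paper.

```python
# Illustrative sketch only: the attraction formula and all names below are
# assumptions, not the scheme defined in the paper.

def attraction(data_gb: float, network_cost: float) -> float:
    """Assumed Newton-like pull: grows with data volume, falls with cost squared."""
    return data_gb / (network_cost ** 2)


def cluster_tasks(data_needs: dict, network_cost: dict) -> dict:
    """Group each task with the repository that attracts it most strongly.

    data_needs[task][repo]   -- GB the task reads from that repository
    network_cost[task][repo] -- positive cost (e.g. latency or load) of the path
    """
    clusters = {}
    for task, needs in data_needs.items():
        best = max(
            needs,
            key=lambda repo: attraction(needs[repo], network_cost[task][repo]),
        )
        clusters.setdefault(best, []).append(task)
    return clusters


# Hypothetical workload: three tasks, two storage repositories.
data_needs = {
    "t1": {"A": 100.0, "B": 10.0},
    "t2": {"A": 5.0, "B": 50.0},
    "t3": {"A": 40.0, "B": 40.0},
}
network_cost = {
    "t1": {"A": 2.0, "B": 1.0},
    "t2": {"A": 1.0, "B": 2.0},
    "t3": {"A": 4.0, "B": 2.0},
}

print(cluster_tasks(data_needs, network_cost))
```

With these sample figures, t3 needs the same volume from both repositories, so the cheaper network path decides its cluster; t1 is drawn to A by data volume despite the costlier path. A fuller scheme along the lines the abstract describes would also account for resource reservation and repeated re-clustering, which this sketch omits.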

Bibliographic details

Source
International Journal of Parallel Programming, 2014, Issue 5, pp. 755-775 (21 pages)
Authors
Ugo Fiore; Francesco Palmieri; Aniello Castiglione; Alfredo De Santis
Author affiliations

Information Services Center, University of Naples Federico Ⅱ, Via Cinthia 5, 80126 Napoli, Italy;

Department of Industrial and Information Engineering, Second University of Naples, Via Roma 29, 81031 Aversa, Italy;

Department of Computer Science, University of Salerno, Via Ponte don Melillo, 84084 Fisciano (SA), Italy;

Department of Computer Science, University of Salerno, Via Ponte don Melillo, 84084 Fisciano (SA), Italy;

Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
Task scheduling; Clustering; Big data processing; Distributed systems; Meta-scheduling; k-Means; Resource reservation
