Conference: International conference on Euro-Par

Scheduling Data Flow Program in XKaapi: A New Affinity Based Algorithm for Heterogeneous Architectures

Abstract

Efficient implementations of parallel applications on heterogeneous hybrid architectures require a careful balance between computation and communication with accelerator devices. Even if most of the communication time can be overlapped with computation, it is essential to reduce the total volume of data transferred. The literature abounds with ad hoc methods to reach that balance, but these depend on the architecture and the application. We propose a generic mechanism to automatically optimize scheduling between CPUs and GPUs, and compare two strategies within this mechanism: the classical Heterogeneous Earliest Finish Time (HEFT) algorithm and our new, parametrized Distributed Affinity Dual Approximation (DADA) algorithm, which groups tasks by affinity before running a fast dual approximation. We ran experiments on a heterogeneous parallel machine with twelve CPU cores and eight NVIDIA Fermi GPUs. Three standard dense linear algebra kernels from the PLASMA library were ported on top of the XKaapi runtime system, and we report their performance. The results show that both HEFT and DADA perform well under various experimental conditions, but that DADA performs better for larger systems and larger numbers of GPUs and, in most cases, transfers much less data than HEFT to achieve the same performance.
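
As a rough illustration of the HEFT baseline named in the abstract, the sketch below ranks the tasks of a small data flow graph and places each one on the resource that gives the earliest finish time, charging a transfer cost when a predecessor ran on a different resource. It is not the XKaapi implementation: the function name `heft_schedule`, the cost tables, and the two-resource example are assumptions made for illustration, all costs are assumed strictly positive, and the insertion-based slot search of the full HEFT algorithm is omitted.

```python
from functools import lru_cache


def heft_schedule(succ, cost, comm, procs):
    """Simplified HEFT (hypothetical helper, not the XKaapi API).

    succ:  task -> list of successor tasks (a DAG)
    cost:  task -> {resource: execution time}, all strictly positive
    comm:  (u, v) -> transfer time paid only if u and v run on different resources
    procs: tuple of resource names
    Returns task -> (resource, start, finish).
    """
    pred = {t: [] for t in succ}
    for u, vs in succ.items():
        for v in vs:
            pred[v].append(u)

    @lru_cache(maxsize=None)
    def rank(t):
        # Upward rank: average cost plus the longest remaining path to an exit task.
        avg = sum(cost[t][p] for p in procs) / len(procs)
        tail = max((comm.get((t, v), 0.0) + rank(v) for v in succ[t]), default=0.0)
        return avg + tail

    # With positive costs, a task always outranks its successors,
    # so this order places predecessors first.
    order = sorted(succ, key=rank, reverse=True)

    free_at = {p: 0.0 for p in procs}  # time at which each resource becomes idle
    placed = {}
    for t in order:
        best = None
        for p in procs:
            # Inputs arrive when each predecessor finishes, plus a transfer
            # if that predecessor ran on another resource.
            data_ready = max(
                (placed[u][2] + (0.0 if placed[u][0] == p else comm.get((u, t), 0.0))
                 for u in pred[t]),
                default=0.0,
            )
            start = max(free_at[p], data_ready)
            finish = start + cost[t][p]
            if best is None or finish < best[2]:
                best = (p, start, finish)
        placed[t] = best
        free_at[best[0]] = best[2]
    return placed


# Made-up diamond graph A -> {B, C} -> D on one CPU and one GPU.
succ = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
cost = {
    "A": {"cpu": 2, "gpu": 1},
    "B": {"cpu": 4, "gpu": 1},
    "C": {"cpu": 1, "gpu": 5},
    "D": {"cpu": 2, "gpu": 1},
}
comm = {("A", "B"): 2, ("A", "C"): 2, ("B", "D"): 2, ("C", "D"): 2}

plan = heft_schedule(succ, cost, comm, ("cpu", "gpu"))
makespan = max(finish for _, _, finish in plan.values())
```

In this toy instance the greedy choice looks only at finish time, so task `C` is sent to the CPU despite the transfer it triggers. That is the kind of data movement the abstract reports DADA reducing by grouping tasks by affinity first.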
