
Scenario-Based Execution Method for Massively Parallel Accelerators

Abstract

The manycore architecture has become one of the main options for accelerating massively parallel computation. For the world's top supercomputers, accelerators such as the graphics processing unit (GPU) are all but unavoidable as a means of achieving high performance. However, such accelerators are installed on processing nodes where the CPU cores control them via a peripheral bus. Therefore, even if the accelerator offers a large amount of parallelism, performance is inevitably degraded by data migration overheads: downloading the kernel program to the accelerator and transferring the I/O data that the program consumes. To avoid these overheads, it is important to pack many tasks into a single kernel and to invoke kernel programs as few times as possible. However, such packing is very difficult because I/O data must be exchanged in recursive operations or among different computations. To address this problem, a single execution of the kernel program must be able to invoke many different program contents. This paper proposes a novel execution mechanism for accelerators, called scenario-based execution, that drastically improves performance. It exploits the potential performance of the accelerators, and the application invokes all program contents on the accelerator side. This paper discusses the design and implementation of the scenario-based execution method, as well as its performance with a realistic application.
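The contrast drawn in the abstract, between one kernel invocation per task and a single invocation that dispatches many program contents on the accelerator side, can be illustrated with a toy host-side model. The sketch below is not the paper's implementation: `ToyAccelerator`, `CONTENTS`, `run_per_task`, and `run_scenario` are hypothetical names introduced here, and the "device" is a plain Python object that only counts launches and host-device transfers.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class ToyAccelerator:
    """Stand-in for a device (an assumption, not a GPU API): counts host-visible events."""
    launches: int = 0
    transfers: int = 0
    memory: Dict[str, List[float]] = field(default_factory=dict)

    def upload(self, name: str, data: List[float]) -> None:
        self.memory[name] = list(data)
        self.transfers += 1

    def download(self, name: str) -> List[float]:
        self.transfers += 1
        return list(self.memory[name])


# Hypothetical "program contents": each reads one device buffer and writes another.
def scale(mem: Dict[str, List[float]], src: str, dst: str) -> None:
    mem[dst] = [2.0 * x for x in mem[src]]


def offset(mem: Dict[str, List[float]], src: str, dst: str) -> None:
    mem[dst] = [x + 1.0 for x in mem[src]]


CONTENTS = {"scale": scale, "offset": offset}

Step = Tuple[str, str, str]  # (content name, source buffer, destination buffer)


def run_per_task(acc: ToyAccelerator, data: List[float], steps: List[Step]) -> List[float]:
    """Baseline: one launch per step; intermediate data round-trips through the host."""
    current = list(data)
    for name, _, _ in steps:
        acc.upload("in", current)
        acc.launches += 1
        CONTENTS[name](acc.memory, "in", "out")
        current = acc.download("out")
    return current


def run_scenario(acc: ToyAccelerator, data: List[float], steps: List[Step]) -> List[float]:
    """Scenario-style: upload once, launch once, dispatch every step on the device side."""
    acc.upload(steps[0][1], data)
    acc.launches += 1
    for name, src, dst in steps:  # models the dispatcher loop inside the single kernel
        CONTENTS[name](acc.memory, src, dst)
    return acc.download(steps[-1][2])


if __name__ == "__main__":
    scenario = [("scale", "a", "b"), ("offset", "b", "a"), ("scale", "a", "b")]
    base, packed = ToyAccelerator(), ToyAccelerator()
    print(run_per_task(base, [1.0, 2.0, 3.0], scenario), base.launches, base.transfers)
    print(run_scenario(packed, [1.0, 2.0, 3.0], scenario), packed.launches, packed.transfers)
```

In this model both paths compute the same result, but the per-task path pays one launch and two transfers for every step, while the scenario path pays one launch and two transfers in total. Real launch and transfer costs depend on the hardware and are not modeled here.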
