Conference: International Conference series on Parallel Computing

Using Task-Based Parallelism Directly on the GPU for Automated Asynchronous Data Transfer

Abstract

We present a framework, based on the QuickSched [1] library, that implements priority-aware task-based parallelism directly on CUDA GPUs. This allows large computations with complex data dependencies to be executed in a single GPU kernel call, removing any synchronization points that might otherwise be required between kernel calls. Using this paradigm, data transfers to and from the GPU are modelled as load and unload tasks. These tasks are automatically generated and executed alongside the rest of the computational tasks, allowing fully asynchronous and concurrent data transfers. We implemented a tiled QR decomposition and a Barnes-Hut gravity calculation, both of which show significant improvement when using the task-based setup, effectively eliminating any latencies due to data transfers between the GPU and the CPU. This shows that task-based parallelism is a valid alternative programming paradigm on GPUs and can provide significant gains from both a data transfer and an ease-of-use perspective.
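The idea of treating transfers as ordinary tasks can be illustrated with a small host-side sketch. It is not the QuickSched API and does not run on a GPU: the names `Task`, `build_graph`, `critical_path_weights`, and `run_order` are hypothetical, the costs are made up, and the priority is assumed to be a critical-path weight, which is one common choice. Every resource a compute task touches gets one generated load task, every written resource gets one generated unload task, and a single priority queue then releases loads, computations, and unloads as their dependencies resolve.

```python
import heapq
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    kind: str                 # "load", "compute", or "unload"
    cost: int = 1             # assumed relative cost, used only for priorities
    unlocks: list = field(default_factory=list)  # indices of dependent tasks
    wait: int = 0             # number of unresolved dependencies


def build_graph(compute_tasks):
    """compute_tasks: (name, reads, writes, cost) tuples in program order."""
    tasks, producer, readers, written = [], {}, {}, []

    def add(name, kind, cost=1):
        tasks.append(Task(name, kind, cost))
        return len(tasks) - 1

    def depend(before, after):
        tasks[before].unlocks.append(after)
        tasks[after].wait += 1

    for name, reads, writes, cost in compute_tasks:
        t = add(name, "compute", cost)
        for res in sorted(set(reads) | set(writes)):
            if res not in producer:          # first touch: generate a load task
                producer[res] = add(f"load {res}", "load")
                readers[res] = []
            depend(producer[res], t)         # wait for the current value
        for res in writes:
            for r in readers[res]:           # earlier readers must finish first
                depend(r, t)
            producer[res], readers[res] = t, []
            if res not in written:
                written.append(res)
        for res in reads:
            if res not in writes:
                readers[res].append(t)
    for res in written:                      # generate one unload per result
        u = add(f"unload {res}", "unload")
        depend(producer[res], u)
    return tasks


def critical_path_weights(tasks):
    """Weight = own cost plus the heaviest chain of tasks it unlocks."""
    weights = [None] * len(tasks)

    def weight(i):
        if weights[i] is None:
            tail = max((weight(j) for j in tasks[i].unlocks), default=0)
            weights[i] = tasks[i].cost + tail
        return weights[i]

    return [weight(i) for i in range(len(tasks))]


def run_order(tasks):
    """Sequentially simulate a queue that always picks the heaviest ready task."""
    weights = critical_path_weights(tasks)
    wait = [t.wait for t in tasks]
    ready = [(-weights[i], i) for i in range(len(tasks)) if wait[i] == 0]
    heapq.heapify(ready)
    order = []
    while ready:
        _, i = heapq.heappop(ready)
        order.append(tasks[i].name)
        for j in tasks[i].unlocks:
            wait[j] -= 1
            if wait[j] == 0:
                heapq.heappush(ready, (-weights[j], j))
    return order


# A toy three-tile workload: one factorisation followed by two updates.
example = [
    ("factor A", ["A"], ["A"], 3),
    ("update B", ["A", "B"], ["B"], 2),
    ("update C", ["A", "C"], ["C"], 2),
]
order = run_order(build_graph(example))
print(order)
```

In this toy run the load of tile A is released first because the longest chain of work hangs off it, and the other loads are interleaved with computation rather than done up front. The simulation is sequential; on a GPU, many thread blocks would pull from the queue at once, which is where the overlap of transfers and computation would come from.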
