Conference: IEEE International Conference on High Performance Computing

Thrust++: Extending Thrust Framework for Better Abstraction and Performance

Abstract

A good design abstraction framework for high performance computing should provide a higher-level programming abstraction that balances abstraction against visibility into the hardware, so that software developers can write portable software without having to understand hardware nuances, yet still exploit the available compute power. In this paper we analyze a popular design abstraction framework from NVIDIA called Thrust and propose an extension, Thrust++, that provides an abstraction over the memory hierarchy of an NVIDIA GPU. Thrust++ allows developers to make efficient use of shared memory and, overall, provides better control over the GPU memory hierarchy while writing applications in Thrust style for the CUDA backend. We show that applications written for the CUDA backend using Thrust++ suffer minimal performance degradation compared with their equivalent CUDA versions. Further, Thrust++ provides almost 4x speedup over Thrust for certain compute-intensive kernels that repeatedly use the reduce operation.
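To make the shared-memory reduce pattern mentioned in the abstract concrete, the following is a minimal host-side C++ sketch of a tiled tree reduction. It is not the Thrust++ API and not code from the paper. The names `kTile`, `reduce_tile`, and `tiled_sum` are hypothetical, and the small `scratch` array only stands in for the per-block shared memory of a GPU. On real hardware the inner loop would run in parallel across the threads of a block, with a barrier between strides.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile size, standing in for the thread count of one CUDA block.
constexpr std::size_t kTile = 256;

// Tree-reduce up to kTile values inside a small scratch buffer.
// The scratch buffer plays the role of shared memory (an assumption of this
// sketch): data is loaded once, then combined in log2(kTile) halving steps.
float reduce_tile(const float* data, std::size_t count) {
    float scratch[kTile] = {0.0f};           // zero padding for partial tiles
    std::copy(data, data + count, scratch);  // single load from "global" memory
    for (std::size_t stride = kTile / 2; stride > 0; stride /= 2) {
        // On a GPU, each i would be handled by a separate thread.
        for (std::size_t i = 0; i < stride; ++i) {
            scratch[i] += scratch[i + stride];
        }
    }
    return scratch[0];
}

// Sum a whole array by reducing it tile by tile and adding the partial sums.
float tiled_sum(const std::vector<float>& input) {
    float total = 0.0f;
    for (std::size_t begin = 0; begin < input.size(); begin += kTile) {
        const std::size_t count = std::min(kTile, input.size() - begin);
        total += reduce_tile(input.data() + begin, count);
    }
    return total;
}
```

Calling `tiled_sum` on a vector gives the same result as a plain sequential sum for inputs whose partial sums are exactly representable. The point of the sketch is only the access pattern: each tile is read once and then combined entirely in fast local storage.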