Thrust++: Extending Thrust Framework for Better Abstraction and Performance


Abstract

A good design abstraction framework for high performance computing should provide a higher-level programming abstraction that balances abstraction against visibility into the hardware, so that software developers can write portable software without having to understand hardware nuances, yet still exploit the available compute power optimally. In this paper we analyze a popular design abstraction framework from NVIDIA called "Thrust", and propose an extension called Thrust++ that provides abstraction over the memory hierarchy of an NVIDIA GPU. Thrust++ allows developers to make efficient use of shared memory and, overall, provides better control over the GPU memory hierarchy while writing applications in Thrust style for the CUDA backend. We show that applications written for the CUDA backend using Thrust++ suffer minimal performance degradation compared to their equivalent CUDA versions. Further, Thrust++ provides almost 4x speedup over Thrust for certain compute-intensive kernels that repeatedly use the reduce operation.

Bibliographic details

Source
2017 IEEE 24th International Conference on High Performance Computing | 2017 | pp. 368-377 | 10 pages
Conference location: Jaipur (IN)
Authors
Ajai V. George; Sankar Manoj; Sanket R. Gupte; Sayantan Mitra; Santonu Sarkar
Original format: PDF
Language: eng
Keywords
Graphics processing units; Hardware; Libraries; Computer architecture; Performance evaluation; Programming; Complexity theory

Similar literature

1. Kinematic evolution of fold-and-thrust belts in the Yubei-Tangbei area: Implications for tectonic events in the southern Tarim Basin [J]. Yiqiong Zhang, Dengfa He, Bin Wu. Geoscience Frontiers, 2021, No. 6.
2. Design and fabrication of a full elastic sub-micron-Newton scale thrust measurement system for plasma micro thrusters [J]. Zhongkai Zhang, Guanrong Hang, Jiayun Qi. Plasma Science and Technology, 2021, No. 10.
3. Design and research of magnetically levitated testbed with composite superconductor bearing for micro thrust measurement [J]. Fawzi Derkaoui, Zhaoxin Liu, Wenjiang Yang. Plasma Science and Technology, 2021, No. 10.
4. Thrust2D: A new design abstraction framework for structured grid class of algorithms [J]. Santonu Sarkar, Ajai V George, Sankar Manoj. Concurrency and Computation: Practice and Experience, 2018, No. 19.
5. A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels [J]. Vinod Valsalam, Anthony Skjellum. Concurrency and Computation, 2002, No. 10.
6. Thrust++: Extending Thrust Framework for Better Abstraction and Performance [C]. Ajai V. George, Sankar Manoj, Sanket R. Gupte. IEEE International Conference on High Performance Computing, 2017.
7. Extending dynamic invariant detection with explicit abstraction [D]. Keith, Daniel Brian. 2012.
8. Extending i2b2 into a framework for semantic abstraction of EHR to facilitate rapid development and portability of Health IT applications [O]. Kavishwar B. Wagholikar, Layne Ainsworth, Vishal P. Vernekar. 2019.
9. Performance and Optimization Abstractions for Large Scale Heterogeneous Systems in the Cactus/Chemora Framework [O]. Erik Schnetter. 2013.
