Real-World Design and Evaluation of Compiler-Managed GPU Redundant Multithreading



Abstract

Reliability for general purpose processing on the GPU (GPGPU) is becoming a weak link in the construction of reliable supercomputer systems. Because hardware protection is expensive to develop, requires dedicated on-chip resources, and is not portable across different architectures, the efficiency of software solutions such as redundant multithreading (RMT) must be explored. This paper presents a real-world design and evaluation of automatic software RMT on GPU hardware. We first describe a compiler pass that automatically converts GPGPU kernels into redundantly threaded versions. We then perform detailed power and performance evaluations of three RMT algorithms, each of which provides fault coverage to a set of structures in the GPU. Using real hardware, we show that compiler-managed software RMT has highly variable costs. We further analyze the individual costs of redundant work scheduling, redundant computation, and inter-thread communication, showing that no single component in general is responsible for high overheads across all applications; instead, certain workload properties tend to cause RMT to perform well or poorly. Finally, we demonstrate the benefit of architectural support for RMT with a specific example of fast, register-level thread communication.


Bibliographic record

Source
ACM/IEEE International Symposium on Computer Architecture | 2014 | 12 pages
Authors
Jack Wadden; Alexander Lyashevsky; Sudhanva Gurumurthi; Vilas Sridharan; Kevin Skadron
Original format: PDF
Chinese Library Classification: TP303-53
Keywords
Compiler-Managed; RMT; GPU


Similar literature


1. Real-World Design and Evaluation of Compiler-Managed GPU Redundant Multithreading [J]. Jack Wadden, Alexander Lyashevsky, Sudhanva Gurumurthi. Computer Architecture News, 2014, No. 3

2. Design and Performance Evaluation of Image Processing Algorithms on GPUs [J]. Park In Kyu, Singhal Nitin, Lee Man Hee. Parallel and Distributed Systems, IEEE Transactions on, 2011, No. 1

3. Parallel multithreaded IDA* heuristic search: algorithm design and performance evaluation [J]. Basel A. Mahafzah. International Journal of Parallel, Emergent and Distributed Systems, 2011, No. 1

4. Real-World Design and Evaluation of Compiler-Managed GPU Redundant Multithreading [C]. Jack Wadden, Alexander Lyashevsky, Sudhanva Gurumurthi. ACM/IEEE International Symposium on Computer Architecture, 2014

5. Multicore processor and hardware transactional memory design space evaluation and optimization using multithreaded workload synthesis [D]. Hughes, Clayton M. 2010

6. Systematic Review of Real-World Studies Evaluating Characteristics Associated With or Programs Designed to Facilitate Outpatient Management of Deep Vein Thrombosis [O]. Erin R. Weeda, Sofia Butt. 2018

7. Detailed Design and Evaluation of Redundant Multithreading Alternatives [O]. Shubhendu S. Mukherjee et al. 2002

