
Real-World Design and Evaluation of Compiler-Managed GPU Redundant Multithreading



Abstract

Reliability for general purpose processing on the GPU (GPGPU) is becoming a weak link in the construction of reliable supercomputer systems. Because hardware protection is expensive to develop, requires dedicated on-chip resources, and is not portable across different architectures, the efficiency of software solutions such as redundant multithreading (RMT) must be explored. This paper presents a real-world design and evaluation of automatic software RMT on GPU hardware. We first describe a compiler pass that automatically converts GPGPU kernels into redundantly threaded versions. We then perform detailed power and performance evaluations of three RMT algorithms, each of which provides fault coverage to a set of structures in the GPU. Using real hardware, we show that compiler-managed software RMT has highly variable costs. We further analyze the individual costs of redundant work scheduling, redundant computation, and inter-thread communication, showing that no single component in general is responsible for high overheads across all applications; instead, certain workload properties tend to cause RMT to perform well or poorly. Finally, we demonstrate the benefit of architectural support for RMT with a specific example of fast, register-level thread communication.
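The core idea of redundant multithreading mentioned in the abstract, running the same computation twice and comparing the results before they are committed, can be illustrated with a small CPU-side simulation. This is a minimal sketch under stated assumptions, not the paper's compiler pass or any of its three RMT algorithms: the function name `run_with_rmt`, the `fault_at` parameter, and the single-bit fault model are all hypothetical and chosen only for illustration.

```python
from typing import Callable, List, Optional, Tuple


def run_with_rmt(
    kernel: Callable[[int], int],
    inputs: List[int],
    fault_at: Optional[int] = None,
) -> Tuple[List[Tuple[int, int]], List[int]]:
    """Simulate redundant execution of `kernel` over `inputs`.

    Each work-item is computed twice (a primary and a redundant copy).
    The result is "stored" only if both copies agree; otherwise the
    work-item index is reported as a detected mismatch.
    `fault_at` (hypothetical) injects a one-bit error into the
    redundant copy of that work-item to mimic a transient fault.
    """
    stored: List[Tuple[int, int]] = []
    mismatches: List[int] = []
    for i, x in enumerate(inputs):
        primary = kernel(x)
        redundant = kernel(x)
        if fault_at == i:
            redundant ^= 1  # flip the lowest bit of the redundant result
        if primary != redundant:
            mismatches.append(i)  # detected: suppress the store
            continue
        stored.append((i, primary))
    return stored, mismatches


if __name__ == "__main__":
    square = lambda x: x * x
    print(run_with_rmt(square, [1, 2, 3]))
    print(run_with_rmt(square, [1, 2, 3], fault_at=1))
```

On a real GPU the two copies would run as separate threads and exchange values through memory or registers, which is where the scheduling, computation, and communication costs discussed in the abstract arise; the sequential loop here hides all of that and shows only the compare-before-store check.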

Bibliographic record

  • Source
    Computer architecture news | 2014, Issue 3 | pp. 73-84 | 12 pages
  • Author affiliations

    University of Virginia, Charlottesville, Virginia, USA;

    AMD Research, Advanced Micro Devices, Inc., Sunnyvale, CA, USA;

    AMD Research, Advanced Micro Devices, Inc., Boxborough, MA, USA;

    RAS Architecture, Advanced Micro Devices, Inc., Boxborough, MA, USA;

    University of Virginia, Charlottesville, Virginia, USA;

  • Original format: PDF
  • Language: English
