Lost in Abstraction: Pitfalls of Analyzing GPUs at the Intermediate Language Level

Abstract

Modern GPU frameworks use a two-phase compilation approach. Kernels written in a high-level language are initially compiled to an implementation agnostic intermediate language (IL), then finalized to the machine ISA only when the target GPU hardware is known. Most GPU microarchitecture simulators available to academics execute IL instructions because there is substantially less functional state associated with the instructions, and in some situations, the machine ISA's intellectual property may not be publicly disclosed. In this paper, we demonstrate the pitfalls of evaluating GPUs using this higher-level abstraction, and make the case that several important microarchitecture interactions are only visible when executing lower-level instructions. Our analysis shows that given identical application source code and GPU microarchitecture models, execution behavior will differ significantly depending on the instruction set abstraction. For example, our analysis shows the dynamic instruction count of the machine ISA is nearly 2× that of the IL on average, but contention for vector registers is reduced by 3× due to the optimized resource utilization. In addition, our analysis highlights the deficiencies of using IL to model instruction fetching, control divergence, and value similarity. Finally, we show that simulating IL instructions adds 33% error as compared to the machine ISA when comparing absolute runtimes to real hardware.

Bibliographic details

Source
IEEE International Symposium on High Performance Computer Architecture, 2018, pp. 608-619 (12 pages)
Authors
Anthony Gutierrez; Bradford M. Beckmann; Alexandru Dutu; Joseph Gross; Michael LeBeane; John Kalamatianos; Onur Kayiran; Matthew Poremba; Brandon Potter; Sooraj Puthoor; Matthew D. Sinclair; Mark Wyse; Jieming Yin; Xianwei Zhang; Akshay Jain; Timothy Rogers
Original format: PDF
Keywords
Kernel; Graphics processing units; Registers; Hardware; Computer architecture; Microarchitecture; Runtime;

Similar literature

1. GPULib: GPU Computing in High-Level Languages [J]. Messmer Peter, Mullowney Paul J., Granger Brian E. Computing in science & engineering. 2008, no. 5
2. Hydrogen abstraction from n-butanol by the hydroxyl radical: High level Ab initio study of the relative significance of various abstraction channels and the role of weakly bound intermediates [J]. Moc J., Simmie J.M. The journal of physical chemistry, A. Molecules, spectroscopy, kinetics, environment, & general theory. 2010, no. 17
3. LOW LEVEL OBJECT ORIENTED LENGUAGE IN CONTEXT OF COMPARATIVE STUDY BETWEEN TWO LOW LEVEL LANGUAGES MSIL(MICROSOFT INTERMEDIATE LANGUAGE)OR CIL AND JAVA BYTE CODE [J]. Amit Juyal, Ashish Pal, Janmejay Pant. International Journal of Engineering Science and Technology. 2012, no. 6
4. Lost in Abstraction: Pitfalls of Analyzing GPUs at the Intermediate Language Level [C]. Anthony Gutierrez, Bradford M. Beckmann, Alexandru Dutu, IEEE International Symposium on High Performance Computer Architecture. 2018
5. Comprehensive Protection for Dynamically-typed Languages: Avoiding the Pitfalls of Language-level Sandboxing [D]. Park, Taemin. 2020
6. CUDASW++2.0: enhanced Smith-Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions [O]. Yongchao Liu, Bertil Schmidt, Douglas L Maskell. 2010
7. A Mapping Path for multi-GPGPU Accelerated Computers from a Portable High Level Programming Abstraction [O]. Allen Leung, Nicolas Vasilache, Benoît Meister, 2010
