Journal: IEEE Transactions on Computers

GPU Instruction Hotspots Detection Based on Binary Instrumentation Approach

Abstract

The problem of profiling a compute kernel running on the CPU is largely solved, thanks to technologies that examine code behavior in detail. Over the last decade, however, computation has been spreading to other devices that are more power- and performance-efficient, creating a strong need for fine-grained code profiling on such devices. Traditional methods are not always sufficient: program counter sampling requires special hardware support, and performance simulation is slow. In this paper, we introduce a novel method for instruction hotspot detection based on binary instrumentation and demonstrate it on Intel(R) Graphics. The method relies on three key principles: measurement at instruction-block granularity, deliberate placement of probes, and a combination of static and runtime information. We demonstrate its ability to highlight the hottest instructions and lines of code, as well as its relatively low runtime overhead, with run time comparable to that of a native run. The method is applicable to GPUs and accelerators with in-order architectures and could be used in the rapidly growing segment of accelerator solutions for computer vision and artificial intelligence.
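The abstract does not show how block-level measurements are turned into instruction and line hotspots, so the following is only an illustrative sketch of one way static and runtime information could be combined, not the authors' implementation. It assumes that probes yield one execution count per instruction block (runtime information) and that each instruction carries a source line and a per-execution cost estimate (static information). The names `Instruction`, `BasicBlock`, `static_cost`, `instruction_hotness`, `line_hotness`, and `hottest`, as well as the sample numbers, are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Instruction:
    address: int      # offset of the instruction in the kernel binary
    source_line: int  # source line taken from debug info (static)
    static_cost: int  # assumed per-execution cost estimate (static)


@dataclass(frozen=True)
class BasicBlock:
    block_id: int
    instructions: tuple  # straight-line instructions, executed together


def instruction_hotness(blocks, block_counts):
    """Estimate the total cost per instruction address.

    Every instruction in a block runs once per block execution, so one
    counter per block is enough to cover all of its instructions.
    """
    hotness = {}
    for block in blocks:
        count = block_counts.get(block.block_id, 0)
        for ins in block.instructions:
            hotness[ins.address] = count * ins.static_cost
    return hotness


def line_hotness(blocks, block_counts):
    """Aggregate the estimated instruction cost per source line."""
    totals = defaultdict(int)
    for block in blocks:
        count = block_counts.get(block.block_id, 0)
        for ins in block.instructions:
            totals[ins.source_line] += count * ins.static_cost
    return dict(totals)


def hottest(scores, top_n=3):
    """Return the top_n (key, score) pairs, highest score first."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical sample data: block 1 is a loop body that runs 1000 times.
blocks = [
    BasicBlock(0, (Instruction(0x00, 10, 1), Instruction(0x10, 10, 2))),
    BasicBlock(1, (Instruction(0x20, 12, 4), Instruction(0x30, 13, 1))),
]
block_counts = {0: 1, 1: 1000}  # counts as the probes might report them

if __name__ == "__main__":
    print(hottest(instruction_hotness(blocks, block_counts), top_n=2))
    print(hottest(line_hotness(blocks, block_counts), top_n=2))
```

Counting per block rather than per instruction keeps the number of probes small, which is one plausible reason a block-granularity design can keep overhead low. The in-order assumption matters here because a fixed per-instruction cost estimate is only meaningful when instructions execute in program order.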
