International Conference on Computational Science and Computational Intelligence

Optimizing a GPU Algorithm through Hardware Profiling Analysis

Abstract

The use of GPU-based architectures for scientific computing has increased steadily in recent years. This new paradigm for both programming and execution has been applied to solve several classic problems much faster than the conventional multiprocessor and/or multicomputer approach. Compared to conventional CPUs, these architectures improve performance for specific types of algorithms that are particularly well suited to their larger number of simpler cores, which execute a single instruction at a time, each on a different set of data. Since this is still a relatively new technology, GPU device manufacturers as well as independent researchers have published several experiences (success stories), best practices, and optimization guides to help developers obtain maximum program performance. However, there is still little information about the optimizations that can only be harnessed by analyzing a specific device's hardware performance counters. In this paper, we discuss several optimizations based on hardware profiling and share our lessons learned about how such data can be used to optimize a scientific algorithm on a GPU using CUDA.
