
Searching CUDA code autotuning spaces with hardware performance counters: data from benchmarks running on various GPU architectures



Abstract

We have developed several autotuning benchmarks in CUDA that take into account performance-relevant source-code parameters and reach near-peak performance on various GPU architectures. We used them during the development and evaluation of a search method for tuning spaces proposed in . With our framework, Kernel Tuning Toolkit, freely available on GitHub, we measured computation times and hardware performance counters on several GPUs for the complete tuning spaces of five benchmarks. These data, which we provide here, might benefit research on search algorithms for the tuning spaces of GPU codes, or research on the relation between applied code optimizations, hardware performance counters, and GPU kernel performance.

Moreover, we describe in detail the scripts we used for robust evaluation of our searcher and its comparison to others. In particular, the script that simulates the tuning, i.e., replaces the time-demanding compilation and execution of the tuned kernels with a quick lookup of the computation time in our measured data, makes it possible to inspect the convergence of the tuning search over a large number of experiments. These scripts, freely available with our other codes, make it easier to experiment with search algorithms and compare them in a robust and reproducible way.

During our research, we generated models for predicting values of performance counters from values of tuning parameters of our benchmarks. Here, we provide the models themselves and describe the scripts we implemented for their training. These data might benefit researchers who want to reproduce or build on our research.
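
The idea of simulated tuning described above can be illustrated with a minimal sketch. This is not the authors' script: it assumes a plain random search without repetition, and the function names (`simulate_random_search`, `mean_convergence`), the tuning parameters, and the runtimes in `measured` are all hypothetical, made-up values chosen for illustration. Each search step looks up a pre-measured runtime instead of compiling and running a kernel, so many repeated experiments are cheap and the average convergence curve can be inspected.

```python
import random
import statistics


def simulate_random_search(measured, steps, rng):
    """Replay one random search over a table of pre-measured runtimes."""
    configs = list(measured)
    rng.shuffle(configs)  # visit configurations in random order, no repeats
    best = float("inf")
    trace = []
    for cfg in configs[:steps]:
        # A table lookup stands in for compiling and executing the kernel.
        best = min(best, measured[cfg])
        trace.append(best)  # best runtime found so far
    return trace


def mean_convergence(measured, steps, experiments, seed=0):
    """Average the best-so-far runtime per step over many simulated searches."""
    steps = min(steps, len(measured))
    rng = random.Random(seed)  # fixed seed keeps the experiment reproducible
    traces = [simulate_random_search(measured, steps, rng)
              for _ in range(experiments)]
    return [statistics.mean(t[i] for t in traces) for i in range(steps)]


# Hypothetical tuning space: (block_size, unroll_factor) -> runtime in ms.
# These numbers are invented for illustration, not taken from the dataset.
measured = {
    (32, 1): 4.0, (32, 2): 3.5, (64, 1): 2.8, (64, 2): 2.1,
    (128, 1): 2.4, (128, 2): 1.9, (256, 1): 3.0, (256, 2): 2.6,
}

curve = mean_convergence(measured, steps=8, experiments=1000)
for step, runtime in enumerate(curve, start=1):
    print(f"step {step}: mean best runtime {runtime:.3f} ms")
```

Swapping `simulate_random_search` for another search strategy while keeping the same lookup table would allow the two convergence curves to be compared on identical data.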
