Machine Learning Based Auto-tuning for Enhanced OpenCL Performance Portability

Abstract

Heterogeneous computing, which combines devices with different architectures, is rising in popularity, and promises increased performance combined with reduced energy consumption. OpenCL has been proposed as a standard for programming such systems, and offers functional portability. It does, however, suffer from poor performance portability: code tuned for one device must be re-tuned to achieve good performance on another device. In this paper, we use machine learning-based auto-tuning to address this problem. Benchmarks are run on a random subset of the entire tuning parameter configuration space, and the results are used to build an artificial neural network-based model. The model can then be used to find interesting parts of the parameter space for further search. We evaluate our method with different benchmarks, on several devices, including an Intel i7 3770 CPU, an Nvidia K40 GPU and an AMD Radeon HD 7970 GPU. Our model achieves a mean relative error as low as 6.1%, and is able to find configurations as little as 1.3% worse than the global minimum.
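
The two-stage procedure described in the abstract (benchmark a random subset, fit a neural network model, then search only the regions the model predicts to be fast) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the parameter space `SPACE`, the synthetic `measure` function that stands in for running a real OpenCL benchmark, the small `TinyMLP` network, the choice to fit log-runtimes, and the sample sizes.

```python
import itertools
import math
import random

# Hypothetical tuning parameters: two work-group dimensions and an unroll factor.
SPACE = list(itertools.product([1, 2, 4, 8, 16, 32, 64],
                               [1, 2, 4, 8, 16, 32, 64],
                               [1, 2, 4, 8]))

def measure(config):
    """Stand-in for benchmarking a kernel on a device; returns a synthetic runtime."""
    wx, wy, unroll = config
    return (1.0 + (math.log2(wx) - 4) ** 2 + (math.log2(wy) - 2) ** 2
            + 0.5 * (math.log2(unroll) - 1) ** 2)

def features(config):
    """Scale log2 of each parameter to roughly [0, 1]."""
    return [math.log2(v) / 6.0 for v in config]

class TinyMLP:
    """One tanh hidden layer and a linear output, trained with plain SGD."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def _hidden(self, x):
        xb = x + [1.0]  # append bias input
        return [math.tanh(sum(w * v for w, v in zip(row, xb))) for row in self.w1]

    def predict(self, x):
        hb = self._hidden(x) + [1.0]
        return sum(w * v for w, v in zip(self.w2, hb))

    def train(self, xs, ys, epochs=300, lr=0.05):
        for _ in range(epochs):
            for x, y in zip(xs, ys):
                h = self._hidden(x)
                err = self.predict(x) - y  # gradient of 0.5 * err**2 w.r.t. output
                xb = x + [1.0]
                # Update hidden weights first, using the not-yet-updated output weights.
                for j, row in enumerate(self.w1):
                    grad_h = err * self.w2[j] * (1.0 - h[j] ** 2)
                    for i in range(len(row)):
                        row[i] -= lr * grad_h * xb[i]
                for j, v in enumerate(h + [1.0]):
                    self.w2[j] -= lr * err * v

def autotune(space, measure, n_samples=40, n_candidates=10, seed=0):
    rng = random.Random(seed)
    # Stage 1: benchmark a random subset of the configuration space.
    sample = rng.sample(space, n_samples)
    measured = {c: measure(c) for c in sample}
    # Fit the model to log-runtimes (an assumption here, to compress the range).
    model = TinyMLP(len(sample[0]), 8, rng)
    model.train([features(c) for c in sample],
                [math.log(measured[c]) for c in sample])
    # Stage 2: rank unmeasured configurations by predicted runtime and
    # benchmark only the most promising ones.
    rest = [c for c in space if c not in measured]
    rest.sort(key=lambda c: model.predict(features(c)))
    for c in rest[:n_candidates]:
        measured[c] = measure(c)
    best = min(measured, key=measured.get)
    return best, measured

best, measured = autotune(SPACE, measure)
print("best config:", best, "runtime:", measured[best],
      "benchmarked:", len(measured), "of", len(SPACE))
```

In this sketch only 50 of the 196 configurations are ever benchmarked, and the returned configuration is the fastest among those measured. How close it comes to the true optimum depends on how well the model fits the sample, which is the quantity the abstract reports as mean relative error.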
