Journal: Scientific Programming

Manycore Performance-Portability: Kokkos Multidimensional Array Library



Abstract

Large, complex scientific and engineering application codes have a significant investment in computational kernels that implement their mathematical models. Porting these computational kernels to the collection of modern manycore accelerator devices is a major challenge because these devices have diverse programming models, application programming interfaces (APIs), and performance requirements. The Kokkos Array programming model provides a library-based approach to implementing computational kernels that are performance-portable to CPU-multicore and GPGPU accelerator devices. This programming model is based upon three fundamental concepts: (1) manycore compute devices, each with its own memory space, (2) data parallel kernels, and (3) multidimensional arrays. Kernel execution performance is extremely dependent on data access patterns, especially for NVIDIA® devices. The optimal data access pattern can differ from one manycore device to another, potentially leading to different implementations of computational kernels specialized for different devices. The Kokkos Array programming model supports performance-portable kernels by (1) separating data access patterns from computational kernels through a multidimensional array API and (2) introducing device-specific data access mappings when a kernel is compiled. An implementation of Kokkos Array is available through Trilinos [Trilinos website, http://trilinos.sandia.gov/, August 2011].
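The separation of data access patterns from kernels described in the abstract can be illustrated with a minimal C++ sketch. It is not the Kokkos Array API: the names `RowMajorMap`, `ColumnMajorMap`, `Array2D`, `fill`, and `row_sums` are hypothetical and chosen only for this illustration. The kernels are written once against `operator()(i, j)`, and the index-to-memory mapping is a template parameter fixed at compile time, standing in for a device-specific mapping.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical index mappings (illustrative only, not the Kokkos Array API).
// Each maps a 2-D index (i, j) of an n0 x n1 array to a 1-D storage offset.
struct RowMajorMap {
    static std::size_t offset(std::size_t i, std::size_t j,
                              std::size_t /*n0*/, std::size_t n1) {
        return i * n1 + j;  // consecutive j are adjacent in memory
    }
};

struct ColumnMajorMap {
    static std::size_t offset(std::size_t i, std::size_t j,
                              std::size_t n0, std::size_t /*n1*/) {
        return i + j * n0;  // consecutive i are adjacent in memory
    }
};

// Hypothetical 2-D array whose memory layout is selected at compile time.
template <class Map>
class Array2D {
public:
    Array2D(std::size_t n0, std::size_t n1)
        : n0_(n0), n1_(n1), data_(n0 * n1, 0.0) {}

    double& operator()(std::size_t i, std::size_t j) {
        assert(i < n0_ && j < n1_);
        return data_[Map::offset(i, j, n0_, n1_)];
    }
    double operator()(std::size_t i, std::size_t j) const {
        assert(i < n0_ && j < n1_);
        return data_[Map::offset(i, j, n0_, n1_)];
    }

    std::size_t extent0() const { return n0_; }
    std::size_t extent1() const { return n1_; }
    const std::vector<double>& storage() const { return data_; }

private:
    std::size_t n0_;
    std::size_t n1_;
    std::vector<double> data_;
};

// Kernels written once; they never mention the layout.
template <class Map>
void fill(Array2D<Map>& a) {
    for (std::size_t i = 0; i < a.extent0(); ++i)
        for (std::size_t j = 0; j < a.extent1(); ++j)
            a(i, j) = 10.0 * static_cast<double>(i) + static_cast<double>(j);
}

template <class Map>
std::vector<double> row_sums(const Array2D<Map>& a) {
    std::vector<double> sums(a.extent0(), 0.0);
    for (std::size_t i = 0; i < a.extent0(); ++i)
        for (std::size_t j = 0; j < a.extent1(); ++j)
            sums[i] += a(i, j);
    return sums;
}
```

Instantiating `Array2D<RowMajorMap>` and `Array2D<ColumnMajorMap>` and passing each to `fill` and `row_sums` yields the same results, while the underlying storage order differs. In this sketch the loops are serial; a data parallel dispatch over `i` is left out to keep the example small.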
