Journal: Astronomy and Computing

Using hybrid GPU/CPU kernel splitting to accelerate spherical convolutions

Abstract

We present a general method for accelerating, by more than an order of magnitude, the convolution of pixelated functions on the sphere with a radially symmetric kernel. Our method splits the kernel into a compact real-space component and a compact spherical-harmonic-space component. These components can then be convolved in parallel using an inexpensive commodity GPU and a CPU. We provide models for the computational cost of both real-space and Fourier-space convolutions and an estimate for the approximation error. Using these models, we can determine the optimum split that minimizes the wall-clock time for the convolution while satisfying the desired error bounds. We apply this technique to the problem of simulating a cosmic microwave background (CMB) anisotropy sky map at the resolution typical of the high-resolution maps produced by the Planck mission. For the main Planck CMB science channels we achieve a speedup of over a factor of ten, assuming an acceptable fractional rms error of order $10^{-5}$ in the power spectrum of the output map. © 2015 Elsevier B.V. All rights reserved.
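
The splitting idea can be illustrated with a minimal sketch on a 1D periodic ring, which is only an analogue of the sphere and not the authors' implementation. The FFT stands in for the spherical harmonic transform. The Gaussian taper, the hard band limit at `6 * l_cut`, and the names `split_convolve`, `l_cut`, and `radius` are all assumptions made for illustration. The kernel is written as a band-limited part plus a remainder, the remainder is truncated to a compact real-space stencil, and the two partial convolutions are summed.

```python
import numpy as np

def split_convolve(signal, kernel, l_cut, radius):
    """Toy 1D periodic analogue of kernel splitting (illustrative only).

    `kernel` is sampled on the same ring as `signal`, centred on index 0.
    `l_cut` (taper width in modes) and `radius` (stencil half-width in
    pixels) are hypothetical tuning parameters.
    """
    n = signal.size
    k_hat = np.fft.rfft(kernel)
    modes = np.arange(k_hat.size)

    # Assumed smooth low-pass window; any smooth taper would serve.
    taper = np.exp(-0.5 * (modes / l_cut) ** 2)

    # Long-range part: made compact in Fourier space by a hard band limit.
    long_hat = k_hat * taper
    long_hat[modes > 6 * l_cut] = 0.0

    # Short-range part: the remainder, truncated to |offset| <= radius.
    short = np.fft.irfft(k_hat - long_hat, n)
    offsets = np.minimum(np.arange(n), n - np.arange(n))
    short_compact = np.where(offsets <= radius, short, 0.0)

    # Fourier-space branch (the role the abstract assigns to the CPU).
    out_long = np.fft.irfft(np.fft.rfft(signal) * long_hat, n)

    # Real-space branch (the GPU's role): direct sum over the stencil only.
    out_short = np.zeros(n)
    for d in range(-radius, radius + 1):
        out_short += short_compact[d % n] * np.roll(signal, d)

    return out_long + out_short
```

Without the truncation the two parts sum to the original kernel exactly, so the only approximation error comes from discarding the remainder outside `radius`. In this toy setting, a larger `l_cut` shrinks the stencil needed for a given error but widens the band the Fourier branch must carry, which is the kind of trade-off the cost and error models in the abstract are meant to balance.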
