Evaluating optimizations that reduce global memory accesses of stencil computations in GPGPUs

Thiago Carrijo Nasciutti; Jairo Panetta; Pedro Pais Lopes

Abstract

This work compares the performance of optimizations that transform replicated global memory accesses into local memory accesses on 3D stencil computations in the NVIDIA Tesla K80 GPGPU. The optimizations reduce global memory contention caused by the set of multiprocessors. Evaluated optimizations are grid tiling, inserting spatial and temporal loops into kernels, register reuse, and some of their combinations. A standardized experiment evaluates performance variation with grid size and stencil size for each optimization. Experimental data show that codes that use these optimizations are up to 3.3 times faster than the classical stencil formulation. It also shows that the most profitable optimization varies with grid and stencil sizes.
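
The idea of turning replicated global memory accesses into local ones can be illustrated with a small CPU-side model. The sketch below is not the authors' code and does not run on a GPU: it is a plain Python illustration, under the assumption of a star-shaped 3D stencil with unit coefficients, that counts how many reads of the "global" grid a classical formulation issues compared with a tiled formulation that first stages each tile and its halo in a local buffer. The names `make_grid`, `star_offsets`, `classical` and `tiled` are hypothetical and chosen only for this example.

```python
from itertools import product


def make_grid(n):
    # Deterministic test data for an n x n x n grid, keyed by coordinates.
    return {(i, j, k): float(i + 2 * j + 3 * k)
            for i, j, k in product(range(n), repeat=3)}


def star_offsets(r):
    # Star-shaped stencil of radius r: the centre plus r neighbours
    # in each of the six axis directions (6 * r + 1 points in total).
    offs = [(0, 0, 0)]
    for d in range(1, r + 1):
        offs += [(d, 0, 0), (-d, 0, 0), (0, d, 0),
                 (0, -d, 0), (0, 0, d), (0, 0, -d)]
    return offs


def classical(grid, n, r):
    # Classical formulation: every output point reads all of its
    # stencil inputs directly from the global grid.
    out, loads = {}, 0
    offs = star_offsets(r)
    for p in product(range(r, n - r), repeat=3):
        acc = 0.0
        for o in offs:
            acc += grid[(p[0] + o[0], p[1] + o[1], p[2] + o[2])]
            loads += 1
        out[p] = acc
    return out, loads


def tiled(grid, n, r, t):
    # Tiled formulation: each t x t x t tile (plus halo) is read once
    # from the global grid into a local buffer, then reused.
    out, loads = {}, 0
    offs = star_offsets(r)
    for base in product(range(r, n - r, t), repeat=3):
        # For simplicity the whole halo cube is staged, including the
        # corner cells that a star stencil never touches.
        halo = [range(b - r, min(b + t, n - r) + r) for b in base]
        local = {}
        for q in product(*halo):
            local[q] = grid[q]
            loads += 1
        inner = [range(b, min(b + t, n - r)) for b in base]
        for p in product(*inner):
            out[p] = sum(local[(p[0] + o[0], p[1] + o[1], p[2] + o[2])]
                         for o in offs)
    return out, loads
```

For a 10 x 10 x 10 grid with radius 1 and 4 x 4 x 4 tiles, this model counts 3584 global reads for the classical version and 1728 for the tiled one, with identical results. With 2 x 2 x 2 tiles the staged halo outweighs the reuse and the tiled count rises to 4096, which is consistent with the abstract's observation that the most profitable optimization depends on grid and stencil sizes.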

Bibliographic details

Source
Concurrency, practice and experience | 2019, Issue 18 | e4929.1-e4929.16 | 16 pages
Authors
Thiago Carrijo Nasciutti; Jairo Panetta; Pedro Pais Lopes
Author affiliation

Divisao de Ciencia da Computacao, Instituto Tecnologico de Aeronautica (ITA), Sao Jose dos Campos, Sao Paulo, Brazil (same affiliation listed for all three authors)

Original format: PDF
Language of text: eng
Keywords
GPGPU; memory hierarchy; stencil computation

Similar literature

1. Evaluating optimizations that reduce global memory accesses of stencil computations in GPGPUs [J]. Thiago Carrijo Nasciutti, Jairo Panetta, Pedro Pais Lopes. Concurrency, practice and experience. 2019, Issue 18
2. Accelerating Stencil Computation on GPGPU by Novel Mapping Method Between the Global Memory and the Shared Memory [J]. Mo Tieqiang, Li Renfa. Computing and informatics. 2018, Issue 3
3. A new memory mapping mechanism for GPGPUs' stencil computation [J]. Mo Tieqiang, Li Renfa. Computing. 2015, Issue 8
4. PADS: A Pattern-Driven Stencil Compiler-Based Tool for Reuse of Optimizations on GPGPUs [C]. Han Dongni, Xu Shixiong, Chen Li. 2011 17th IEEE International Conference on Parallel and Distributed Systems. 2011
5. Optimization of Stencil Computations on GPUs [D]. Rawat, Prashant Singh. 2018
6. Optimizing Data Intensive GPGPU Computations for DNA Sequence Alignment [O]. Cole Trapnell, Michael C. Schatz
7. Parallel Visual Data Restoration on Multi-GPGPUs using Stencil-Reduce Pattern [O]. Marco Aldinucci, Guilherme Peretti Pezzi, Maurizio Drocco. 2016
