
Memory Access Patterns for Cellular Automata Using GPGPUs.



Abstract

Today's graphics processing units have hundreds of individual processing cores that can be used for general-purpose computation of mathematical and scientific problems. Due to their hardware architecture, these devices are especially effective when solving problems that exhibit a high degree of spatial locality. Cellular automata use small, local neighborhoods to determine successive states of individual elements and therefore provide an excellent opportunity for the application of general-purpose GPU computing. However, the GPU presents a challenging environment because it lacks many of the features of traditional CPUs, such as automatic, on-chip caching of data. To fully realize the potential of a GPU, specialized memory techniques and patterns must be employed to account for its unique architecture. Several techniques are presented which not only dramatically improve performance but, in many cases, also simplify implementation. Many of the approaches discussed relate to the organization of data in memory or to patterns for accessing that data, while others detail methods of increasing the ratio of computation to memory access. The ideas presented are generic and applicable to cellular automata models as a whole. Example implementations are given for several problems, including the Game of Life and Gaussian blurring, and performance characteristics, such as instruction and memory access counts, are analyzed and compared. A case study is detailed, showing the effectiveness of the various techniques when applied to a larger, real-world problem. Lastly, the reasoning behind each of the improvements is explained, providing general guidelines for determining when a given technique will be most and least effective.

Bibliographic record

  • Author: Balasalle, James
  • Author affiliation: University of Denver
  • Degree-granting institution: University of Denver
  • Subject: Computer Science
  • Degree: M.S.
  • Year: 2011
  • Pages: 119
  • Original format: PDF
  • Language of text: English
