...
首页> 外文期刊>ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages >Complexity Analysis and Algorithm Design for Reorganizing Data to Minimize Non-Coalesced Memory Accesses on GPU
【24h】

Complexity Analysis and Algorithm Design for Reorganizing Data to Minimize Non-Coalesced Memory Accesses on GPU

机译:数据重组以最大程度减少GPU上的非批量内存访问的复杂度分析和算法设计

获取原文
获取原文并翻译 | 示例

摘要

The performance of Graphic Processing Units (GPU) is sensitive to irregular memory references. Some recent work shows the promise of data reorganization for eliminating non-coalesced memory accesses that are caused by irregular references. However, all previous studies have employed simple, heuristic methods to determine the new data layouts to create. As a result, they either do not provide any performance guarantee or are effective to only some limited scenarios. This paper contributes a fundamental study to the problem. It systematically analyzes the inherent complexity of the problem in various settings, and for the first time, proves that the problem is NP-complete. It then points out the limitations of existing techniques and reveals that in practice, the essence for designing an appropriate data reorganization algorithm can be reduced to a tradeoff among space, time, and complexity. Based on that insight, it develops two new data reorganization algorithms to overcome the limitations of previous methods. Experiments show that an assembly composed of the new algorithms and a previous algorithm can circumvent the inherent complexity in finding optimal data layouts, making it feasible to minimize non-coalesced memory accesses for a variety of irregular applications and settings that are beyond the reach of existing techniques. Categories and Subject Descriptors D.3.4 [Programming Languages]: Processors-optimization, compilers General Terms Performance, Experimentation
机译:图形处理单元(GPU)的性能对不规则的内存引用很敏感。最近的一些工作表明了数据重组的希望,以消除由不规则引用引起的非协商内存访问。但是,所有以前的研究都采用简单的启发式方法来确定要创建的新数据布局。结果,它们要么不提供任何性能保证,要么仅对某些有限的情况有效。本文对该问题做出了基础研究。它系统地分析了各种情况下问题的内在复杂性,并首次证明了该问题是NP完全的。然后指出了现有技术的局限性,并揭示了在实践中,设计适当的数据重组算法的本质可以简化为空间,时间和复杂性之间的权衡。基于这一见解,它开发了两种新的数据重组算法来克服以前方法的局限性。实验表明,由新算法和先前算法组成的程序集可以规避固有的复杂性,以寻找最佳的数据布局,从而使针对非常规应用程序和设置的非高级内存访问最小化是可行的,这超出了现有技术的范围技术。类别和主题描述符D.3.4 [编程语言]:处理器优化,编译器通用术语性能,实验

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号