Journal of Supercomputing

SDAM: a combined stack distance-analytical modeling approach to estimate memory performance in GPUs

Abstract

Graphics processing units (GPUs) are powerful at executing data-parallel applications. Such applications most often rely on the GPU's memory hierarchy to deliver high performance. Designing an efficient memory hierarchy for GPUs is challenging because of the wide architectural design space. To ease this challenge, this paper proposes a framework, called stack distance-analytic modeling (SDAM), to estimate the memory performance of the GPU in terms of memory cycle counts. The quality of such a model depends on its input data: both the accuracy of the inputs and the time spent obtaining them matter. SDAM employs stack distance analysis and analytical modeling to obtain the required inputs accurately and quickly, and then uses a detailed analytical model to estimate memory cycles. SDAM is validated against real GPU executions and compared with a cycle-accurate simulator. The experimental evaluations, performed on a set of memory-intensive benchmarks, show that SDAM is faster and more accurate than cycle-accurate simulation, and thus can facilitate GPU cache design-space exploration. For a selection of data-intensive benchmarks, SDAM showed a 32% average error in estimating memory data transfer cycles on a modern GPU, which is more accurate than cycle-accurate simulation, while being an order of magnitude faster. Finally, the applicability of SDAM to cache design-space exploration in GPUs is demonstrated by experimenting with various cache designs.
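
The abstract does not spell out how SDAM is implemented, so the following is only a minimal sketch of generic stack distance analysis, not the authors' method. It assumes a fully associative LRU cache, where an access hits exactly when its stack distance (the number of distinct addresses touched since the previous access to the same address) is smaller than the cache capacity. The function names, the toy trace, and the fixed hit and miss latencies in `estimated_cycles` are hypothetical and chosen purely for illustration; the paper's analytical model is described as far more detailed.

```python
from collections import OrderedDict


def stack_distances(trace):
    """Return the LRU stack distance of each access (None for a first access)."""
    stack = OrderedDict()  # keys ordered from least to most recently used
    distances = []
    for addr in trace:
        if addr in stack:
            # Distance = number of distinct addresses used since the
            # previous access to addr (its depth from the top of the stack).
            keys = list(stack)
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)  # cold (compulsory) miss
            stack[addr] = True
    return distances


def lru_hits(distances, capacity):
    """Count hits in a fully associative LRU cache holding `capacity` lines."""
    return sum(1 for d in distances if d is not None and d < capacity)


def estimated_cycles(distances, capacity, hit_latency, miss_latency):
    """Toy cycle estimate (assumption): every hit and miss has a fixed cost."""
    hits = lru_hits(distances, capacity)
    misses = len(distances) - hits
    return hits * hit_latency + misses * miss_latency


if __name__ == "__main__":
    trace = ["a", "b", "c", "a", "b", "d", "a"]  # hypothetical cache-line trace
    dists = stack_distances(trace)
    for capacity in (2, 3, 4):
        print(capacity, lru_hits(dists, capacity),
              estimated_cycles(dists, capacity, hit_latency=1, miss_latency=10))
```

Because the distances are computed once per trace, hit counts for many candidate cache sizes can be read off without replaying the trace, which is the general reason stack distance analysis suits design-space exploration.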
