IEEE International Symposium on Parallel and Distributed Computing

Analyzing Memory Access on CPU-GPGPU Shared LLC Architecture



Abstract

Data exchange between GPGPUs and CPUs is becoming more and more important nowadays. One industry trend for alleviating the long latency is to integrate CPUs and GPGPUs on a single chip. In this paper, we analyze the reference interactions between CPU and GPGPU applications with a CPU-GPGPU co-simulator that integrates gem5 and gpgpu-sim. Since the memory controllers are shared among all cores, we observe severe memory contention between them: CPU applications suffer a 1.26x slowdown and 64.79% blocked time in main memory when they run in parallel with GPGPU applications. To alleviate the contention and provide more memory bandwidth, shared last-level caches (LLCs) are commonly employed in such systems. We evaluate a banked shared LLC structure implemented in the co-simulator. We show that a simple shared LLC benefits mostly the GPGPU (2.13x speedup over running alone and 1.7x over running in parallel) rather than the CPU. With the help of the LLC, the memory requests issued to main memory are reduced to 30.74% and the blocked time to 49.64%, which provides more memory bandwidth. The latency-sensitive CPU applications still suffer, because LLC buffer occupancy is very high when they run in parallel with the GPGPU. Besides, as the number of LLC cache banks grows, we reveal that the CPU achieves a higher speedup than the GPGPU from the increased LLC parallelism. Finally, we also discuss the impact of the GPGPU L2 cache, and we find that fewer GPGPU L2 cache banks lower performance because they limit the parallelism of the GPGPU. The observations and inferences in this paper may serve as a reference guide for future CPU-GPGPU shared LLC design.
