Conference on Computing Frontiers

An efficient cache design for scalable glueless shared-memory multiprocessors

Abstract

Traditionally, cache coherence in large-scale shared-memory multiprocessors has been ensured by means of a distributed directory structure stored in main memory. As a result, the main-memory access needed to recover the sharing status of a block is generally placed on the critical path of every cache miss, increasing its latency. Considering the ever-increasing distance to memory, these cache coherence protocols are far from optimal in terms of performance. On the other hand, shared-memory multiprocessors formed by connecting chips that integrate the processor, caches, coherence logic, switch and memory controller through a low-cost, low-latency point-to-point network (glueless shared-memory multiprocessors) are a reality.

In this work, we propose a novel design for the L2 cache level, at which coherence has to be maintained, intended for use in glueless shared-memory multiprocessors. Our proposal splits the cache structure into two different parts: one for storing data and directory information for the blocks requested by the local processor, and another for storing only directory information for blocks accessed by remote processors. Using this cache scheme, we remove the directory from main memory. Besides saving memory space, our proposal brings very significant reductions in cache-miss latency (speed-ups of 3.0 on average), which translate into reductions in applications' execution time of 31% on average.
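
The two-part cache organization described in the abstract can be pictured with a small toy model. The sketch below is only an illustration under stated assumptions, not the authors' design: the names `SplitL2Cache`, `DataEntry`, `record_access` and `sharers` are hypothetical, sharer sets stand in for whatever directory format the paper uses, and capacity limits, replacement and coherence states are left out entirely. How directory information moves between the two parts when a block is used both locally and remotely is also an assumption made here for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class DataEntry:
    """Entry in the data part: block data plus directory (sharer) information."""
    data: bytes
    sharers: set[int] = field(default_factory=set)


@dataclass
class SplitL2Cache:
    """Toy model of an L2 cache split into a data part and a directory-only part.

    Hypothetical illustration: no capacity limit, replacement policy,
    or coherence states are modelled.
    """
    node_id: int
    # Part 1: data + directory info for blocks requested by the local processor.
    data_part: dict[int, DataEntry] = field(default_factory=dict)
    # Part 2: directory info only, for blocks accessed by remote processors.
    dir_only_part: dict[int, set[int]] = field(default_factory=dict)

    def record_access(self, block: int, requester: int, data: bytes = b"") -> None:
        """Record that node `requester` now holds a copy of `block`."""
        if requester == self.node_id:
            # Local request: keep data and directory information together.
            entry = self.data_part.setdefault(block, DataEntry(data))
            # Assumption: sharers tracked in the directory-only part are merged in.
            entry.sharers |= self.dir_only_part.pop(block, set())
            entry.sharers.add(requester)
        elif block in self.data_part:
            # Remote request for a block the local processor already holds.
            self.data_part[block].sharers.add(requester)
        else:
            # Remote request only: store directory information, no data.
            self.dir_only_part.setdefault(block, set()).add(requester)

    def sharers(self, block: int) -> set[int]:
        """Return the sharer set without consulting any main-memory directory."""
        if block in self.data_part:
            return set(self.data_part[block].sharers)
        return set(self.dir_only_part.get(block, set()))
```

In this model the sharing status of a block is always answered from one of the two cache parts, which mirrors the abstract's point that the directory no longer needs to reside in main memory.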