
Optimizing shared cache behavior of chip multiprocessors



Abstract

One of the critical problems associated with emerging chip multiprocessors (CMPs) is the management of on-chip shared cache space. Unfortunately, single-processor-centric data locality optimization schemes may not work well in the CMP case, as data accesses from multiple cores can create conflicts in the shared cache space. The main contribution of this paper is a compiler-directed code restructuring scheme for enhancing locality of shared data in CMPs. The proposed scheme targets the last-level shared cache that exists in many commercial CMPs and has two components: allocation, which determines the set of loop iterations assigned to each core, and scheduling, which determines the order in which the iterations assigned to a core are executed. Our scheme restructures the application code such that the different cores operate on shared data blocks at the same time, to the extent allowed by data dependencies. This helps to reduce reuse distances for the shared data and improves on-chip cache performance. We evaluated our approach using the Splash-2 and Parsec applications through both simulations and experiments on two commercial multi-core machines. Our experimental evaluation indicates that the proposed data locality optimization scheme reduces inter-core conflict misses in the shared cache by 67% on average when both allocation and scheduling are used. Also, the execution time improvements we achieve (29% on average) are very close to the optimal savings that could be achieved using a hypothetical scheme.
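The effect of scheduling on reuse distance can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a toy model in which two cores each touch the same four shared blocks, their accesses reach the shared cache in strict round-robin order, and reuse distance is counted as the number of distinct blocks touched between two accesses to the same block. The function names and the two example schedules are hypothetical.

```python
from typing import Dict, List, Sequence


def interleave(per_core: Sequence[Sequence[int]]) -> List[int]:
    """Merge per-core block sequences round-robin into one shared-cache trace.

    Assumption: all cores advance in lockstep, one access per step.
    """
    trace: List[int] = []
    for step in range(max(len(seq) for seq in per_core)):
        for seq in per_core:
            if step < len(seq):
                trace.append(seq[step])
    return trace


def reuse_distances(trace: Sequence[int]) -> List[int]:
    """For each repeated access, count distinct blocks touched since the
    previous access to the same block."""
    last_pos: Dict[int, int] = {}
    distances: List[int] = []
    for i, block in enumerate(trace):
        if block in last_pos:
            between = trace[last_pos[block] + 1 : i]
            distances.append(len(set(between)))
        last_pos[block] = i
    return distances


def mean_reuse_distance(per_core: Sequence[Sequence[int]]) -> float:
    """Average reuse distance of the interleaved trace (0.0 if no reuse)."""
    d = reuse_distances(interleave(per_core))
    return sum(d) / len(d) if d else 0.0


# Hypothetical schedules: each inner list is the order in which one core
# visits shared blocks 0..3.
STAGGERED = [[0, 1, 2, 3], [2, 3, 0, 1]]  # cores reach a block at different times
ALIGNED = [[0, 1, 2, 3], [0, 1, 2, 3]]    # cores reach a block at the same time

print("staggered:", mean_reuse_distance(STAGGERED))
print("aligned:  ", mean_reuse_distance(ALIGNED))
```

In this toy model the aligned schedule gives a mean reuse distance of 0.0, against 2.5 for the staggered one, which is the kind of gap the scheduling component aims to exploit. A real compiler pass would also have to respect data dependencies and choose the allocation of iterations to cores, neither of which is modeled here.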
