ASR

Abstract

The large working sets of commercial and scientific workloads stress the L2 caches of Chip Multiprocessors (CMPs). Some CMPs use a shared L2 cache to maximize the on-chip cache capacity and minimize off-chip misses. Others use private L2 caches, replicating data to limit the delay due to global wires and minimize cache access time. Recent hybrid proposals use selective replication to balance latency and capacity, but their static replication rules result in performance degradation for some combinations of workloads and system configurations. This paper proposes Adaptive Selective Replication (ASR), a mechanism that dynamically monitors workload behavior to control replication. ASR replicates cache blocks only when it estimates the benefit of replication (lower L2 hit latency) exceeds the cost (more L2 misses). Full-system simulations of 8-processor CMPs show that ASR provides robust performance: improving performance by as much as 29% versus shared caches, 19% versus private caches, and 12% versus CMP-NuRapid [9] and Victim Replication [41]. Furthermore, while ASR does not improve the performance of all workloads, it provides performance stability by always performing at least comparably to the best alternative including Cooperative Caching [8].
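
The replication rule stated in the abstract (replicate only when the estimated benefit exceeds the estimated cost) can be illustrated with a minimal sketch. The abstract does not say how ASR computes its estimates, so everything below is an assumption made for illustration: the `ReplicationEstimate` fields, the cycle-based units, and the way benefit and cost are multiplied out are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ReplicationEstimate:
    """Hypothetical per-interval estimates; field names and units are assumptions."""
    extra_local_hits: int      # accesses that would hit locally with more replication
    cycles_saved_per_hit: int  # assumed latency gap between a remote and a local L2 hit
    extra_misses: int          # additional L2 misses caused by the lost capacity
    cycles_lost_per_miss: int  # assumed latency gap between an L2 miss and an L2 hit


def replication_benefit(e: ReplicationEstimate) -> int:
    # Benefit side of the comparison: lower L2 hit latency.
    return e.extra_local_hits * e.cycles_saved_per_hit


def replication_cost(e: ReplicationEstimate) -> int:
    # Cost side of the comparison: more L2 misses.
    return e.extra_misses * e.cycles_lost_per_miss


def should_replicate(e: ReplicationEstimate) -> bool:
    # Replicate only when the estimated benefit exceeds the estimated cost.
    return replication_benefit(e) > replication_cost(e)


# Illustrative numbers only; they are not measurements from the paper.
favorable = ReplicationEstimate(1000, 20, 50, 300)
unfavorable = ReplicationEstimate(1000, 20, 100, 300)
print(should_replicate(favorable), should_replicate(unfavorable))
```

With these made-up inputs the first case has a benefit of 20000 cycles against a cost of 15000, so the sketch chooses to replicate; in the second case the cost rises to 30000 and it does not.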
