
In-Network Cache Coherence

Abstract

With the trend towards an increasing number of processor cores in future chip architectures, scalable directory-based protocols for maintaining cache coherence will be needed. However, directory-based protocols face well-known problems in delay and scalability. Most current protocol optimizations targeting these problems maintain a firm abstraction of the interconnection network fabric as a communication medium: protocol optimizations consist of end-to-end messages between requestor, directory, and sharer nodes, while network optimizations separately target lowering communication latency for coherence messages. In this paper, we propose an implementation of the cache coherence protocol within the network, embedding directories within each router node that manage and steer requests towards nearby data copies, enabling in-transit optimization of memory access delay. Simulation results across a range of SPLASH-2 benchmarks demonstrate significant performance improvement and good system scalability, with up to 44.5% and 56% savings in average memory access latency for 16- and 64-node systems, respectively, when compared against the baseline directory cache coherence protocol. Detailed microarchitecture and implementation characterization affirms the low area and delay impact of in-network coherence.
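The core idea in the abstract — routers that carry directory state and steer a request toward a nearby copy instead of always forwarding it to the home directory — can be sketched as follows. This is an illustrative toy model on a 1-D chain of routers, not the paper's actual protocol; the `Router` class, its `directory` map, and the hop counting are all assumptions made for the example.

```python
# Illustrative sketch (assumed model, not the paper's protocol): each router
# in a 1-D chain holds a small in-network directory mapping cache-line
# addresses to a nearby node known to hold a valid copy. A read request
# walks hop by hop toward the home node, but any router along the way that
# knows of a closer copy steers the request there, shortening the trip.

class Router:
    def __init__(self, node_id):
        self.node_id = node_id
        # In-network directory: address -> node holding a valid copy.
        self.directory = {}

def route_read(routers, src, home, addr):
    """Walk from src toward home; return (serving_node, hops_taken)."""
    hops = 0
    pos = src
    step = 1 if home >= src else -1
    while pos != home:
        router = routers[pos]
        if addr in router.directory:
            copy_at = router.directory[addr]
            # In-transit hit: steer to the nearby copy instead of
            # continuing all the way to the home directory.
            hops += abs(copy_at - pos)
            return copy_at, hops
        pos += step
        hops += 1
    return home, hops  # No in-network hit: served at the home node.

# 8-node chain; node 7 is the home node for address 0x40.
routers = [Router(i) for i in range(8)]
# Suppose node 2 shares the line and router 1 has recorded that fact.
routers[1].directory[0x40] = 2

node, hops = route_read(routers, src=0, home=7, addr=0x40)
print(node, hops)  # served by node 2 after 2 hops, versus 7 hops to home
```

Without the in-network directory entry, the same request would travel all 7 hops to the home node; the entry at router 1 cuts the path to 2 hops, which is the kind of in-transit memory-access-latency saving the abstract quantifies.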
