Journal: Computer architecture news

Decoupled Store Completion/Silent Deterministic Replay: Enabling Scalable Data Memory for CPR/CFP Processors

Abstract

CPR/CFP (Checkpoint Processing and Recovery/Continual Flow Pipeline) support an adaptive instruction window that scales to tolerate last-level cache misses. CPR/CFP scale the register file by aggressively reclaiming the destination registers of many in-flight instructions. However, an analogous mechanism does not exist for stores and loads. As the window expands, CPR/CFP processors must track all in-flight stores and loads to support forwarding and detect memory ordering violations.

The previously described SVW (Store Vulnerability Window) and SQIP (Store Queue Index Prediction) schemes provide scalable, non-associative load and store queues, respectively. However, they don't work smoothly in a CPR/CFP context. SVW/SQIP rely on the ability to dynamically stall some loads until a specific older store writes to the cache. Enforcing this serialization in CPR/CFP is expensive if the load and store are in the same checkpoint.

We introduce two complementary procedures that implement this serialization efficiently. Decoupled Store Completion (DSC) allows stores to write to the cache before the enclosing checkpoint completes execution. Silent Deterministic Replay (SDR) supports mis-speculation recovery in the presence of DSC by replaying loads older than completed stores using values from the load queue. The combination of DSC and SDR enables an SVW/SQIP based CPR/CFP memory system that outperforms previous designs while occupying less area.
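
The interaction between DSC and SDR described above can be illustrated with a toy software model. This is only a sketch under stated assumptions, not the paper's hardware design: the functions `execute` and `replay`, the `load_log` dictionary (standing in for the load queue), and the `last_completed` index are hypothetical names introduced here, and the model assumes a single checkpoint whose stores all complete in program order.

```python
def execute(program, cache):
    """First pass over one checkpoint (hypothetical toy model).

    Stores write the cache right away, loosely mimicking Decoupled
    Store Completion. Each load's observed value is logged.
    """
    load_log = {}        # op index -> value the load observed ("load queue")
    last_completed = -1  # index of the youngest store already in the cache
    for i, (op, addr, value) in enumerate(program):
        if op == "load":
            load_log[i] = cache.get(addr, 0)
        else:  # "store"
            cache[addr] = value
            last_completed = i
    return load_log, last_completed


def replay(program, cache, load_log, last_completed, silent=True):
    """Re-execute from the checkpoint start after a mis-speculation.

    With silent=True, loads older than the youngest completed store take
    their values from load_log instead of the cache (the SDR-like case).
    With silent=False, every load reads the already-modified cache.
    Completed stores are never written a second time.
    """
    results = []
    for i, (op, addr, value) in enumerate(program):
        if op == "load":
            if silent and i < last_completed:
                results.append(load_log[i])
            else:
                results.append(cache.get(addr, 0))
        elif i > last_completed:
            cache[addr] = value
    return results


# A load, then a store to the same address, then another load.
program = [("load", 0xA, None), ("store", 0xA, 7), ("load", 0xA, None)]
cache = {0xA: 1}
load_log, last_completed = execute(program, cache)

print(replay(program, dict(cache), load_log, last_completed, silent=True))   # [1, 7]
print(replay(program, dict(cache), load_log, last_completed, silent=False))  # [7, 7]
```

In this toy run, the first load precedes a store that has already written the cache, so re-reading the cache during recovery would return the store's value (7) instead of the value the load originally saw (1). Taking the value from the log reproduces the original execution.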
