Journal: IEEE Transactions on Parallel and Distributed Systems

The performance of cache-based error recovery in multiprocessors

Abstract

Several variations of cache-based checkpointing for rollback error recovery from transient errors in shared-memory multiprocessors have recently been developed. By modifying the cache replacement policy, these techniques use the inherent redundancy in the memory hierarchy to periodically checkpoint the computation state. This paper evaluates three schemes that differ in how they avoid rollback propagation. Using simulation driven by address traces from parallel applications running on an Encore Multimax shared-memory multiprocessor, we evaluate the performance effect of integrating the recovery schemes into the cache coherence protocol. Our results indicate that the cache-based schemes can provide checkpointing capability with low performance overhead, but with high and uncontrollable variability in the checkpoint interval.
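
The general idea of checkpointing through the cache replacement policy can be illustrated with a toy model. The sketch below is not any of the three schemes evaluated in the paper. It assumes a single processor, a direct-mapped write-back cache, and the simplest possible rule: main memory always holds the last checkpoint, and replacing a dirty line forces a new checkpoint first. The class name `CheckpointingCache` and all of its methods are hypothetical, and processor registers and rollback propagation between processors are not modeled.

```python
class CheckpointingCache:
    """Toy direct-mapped write-back cache (illustrative assumption only).

    Main memory is only updated at checkpoints, so it always holds the
    state of the most recent checkpoint.
    """

    def __init__(self, num_lines):
        self.num_lines = num_lines
        self.lines = {}                 # index -> (address, value, dirty)
        self.memory = {}                # address -> value at last checkpoint
        self.refs_since_checkpoint = 0
        self.intervals = []             # references per completed interval

    def _checkpoint(self):
        # Commit every dirty line to memory, establishing a new recovery point.
        for index, (addr, value, dirty) in list(self.lines.items()):
            if dirty:
                self.memory[addr] = value
                self.lines[index] = (addr, value, False)
        self.intervals.append(self.refs_since_checkpoint)
        self.refs_since_checkpoint = 0

    def _fill(self, addr):
        index = addr % self.num_lines
        line = self.lines.get(index)
        if line is not None and line[0] == addr:
            return index                # cache hit
        if line is not None and line[2]:
            # Evicting a dirty line would overwrite checkpoint state in
            # memory, so the replacement forces a checkpoint first.
            self._checkpoint()
        self.lines[index] = (addr, self.memory.get(addr, 0), False)
        return index

    def read(self, addr):
        self.refs_since_checkpoint += 1
        return self.lines[self._fill(addr)][1]

    def write(self, addr, value):
        self.refs_since_checkpoint += 1
        index = self._fill(addr)
        self.lines[index] = (addr, value, True)

    def rollback(self):
        # Discard uncommitted (dirty) lines; memory still holds the checkpoint.
        self.lines = {i: l for i, l in self.lines.items() if not l[2]}
        self.refs_since_checkpoint = 0


cache = CheckpointingCache(num_lines=2)
cache.write(0, 10)
cache.write(1, 11)
cache.write(2, 12)   # conflicts with dirty address 0, forcing a checkpoint
cache.rollback()     # the uncommitted write to address 2 is discarded
```

In this toy model the length of each checkpoint interval is decided entirely by when the address stream happens to evict a dirty line, which gives some intuition for why the abstract reports variability in the checkpoint interval that cannot be controlled.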