A recoverable distributed shared memory integrating coherence and recoverability

Abstract

Large-scale distributed systems are very attractive for the execution of parallel applications requiring a huge computing power. However, their high probability of site failure is unacceptable, especially for long time running applications. In this paper, we address this problem and propose a checkpointing mechanism relying on a recoverable distributed shared memory (DSM) in order to tolerate single node failures. Although most recoverable DSMs require specific hardware to store recovery data, our scheme uses standard memories to store both current and recovery data. Moreover, the management of recovery data is merged with the management of current data by extending the DSM's coherence protocol. This approach takes advantage of the data replication provided by a DSM in order to limit the amount of transferred pages during the checkpointing. The paper also presents an implementation and a preliminary performance evaluation of our recoverable DSM on a 56-node Intel Paragon.
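
The abstract does not spell out the protocol, so the following is only a toy Python model of why existing replicas can reduce checkpoint traffic. It assumes, purely for illustration, that a checkpoint needs two recovery copies of every page on distinct nodes to survive a single node failure, and that a page the DSM has already replicated on two or more nodes needs no transfer. The names `checkpoint` and `survives_failure`, the two-copy rule, and the page and node identifiers are all hypothetical and are not taken from the paper.

```python
def checkpoint(copies, nodes):
    """Toy checkpoint: pick two recovery holders per page.

    copies: dict mapping page id -> set of node ids holding a valid copy
    nodes:  set of all node ids in the system
    Returns (recovery, transfers), where recovery maps each page to a
    pair of distinct nodes and transfers counts pages sent over the network.
    Assumption (not from the paper): two copies per page are sufficient.
    """
    recovery = {}
    transfers = 0
    for page in sorted(copies):
        holders = copies[page]
        if not holders:
            raise ValueError(f"page {page!r} has no valid copy")
        # Reuse up to two replicas that the coherence protocol already created.
        chosen = sorted(holders)[:2]
        if len(chosen) < 2:
            # Only one node holds the page: one copy must be transferred.
            others = sorted(n for n in nodes if n not in holders)
            if not others:
                raise ValueError("need at least two nodes")
            chosen.append(others[0])
            transfers += 1
        recovery[page] = tuple(chosen)
    return recovery, transfers


def survives_failure(recovery, failed_node):
    """True if every page keeps a recovery copy when failed_node is lost."""
    return all(any(n != failed_node for n in pair) for pair in recovery.values())


if __name__ == "__main__":
    copies = {"p0": {0, 1}, "p1": {2}, "p2": {1, 2, 3}}
    recovery, transfers = checkpoint(copies, nodes={0, 1, 2, 3})
    print(recovery, transfers)
```

In this model only page `p1`, which has a single holder, causes a transfer; the pages that are already replicated are checkpointed without moving any data.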

Bibliographic record

Source
Fault-Tolerant Computing, 1995. FTCS-25. Digest of Papers., Twenty-Fifth International Symposium on, pp. 289-298 (10 pages)
Authors
Kermarrec A.-M.; Cabillic G.; Gefflaut A.; Morin C.; Puaut I.
Original format: PDF
Similar literature

1. CACHE COHERENCE IN CENTRALIZED SHARED MEMORY AND DISTRIBUTED SHARED MEMORY ARCHITECTURES [J]. Sujit Deshpande, Priya Ravale, Sulabha Apte. International Journal on Computer Science and Engineering. 2011, Special issue
2. On the correctness of program execution when cache coherence is maintained locally at data-sharing boundaries in distributed shared memory multiprocessors [J]. Sarojadevi H, Nandy SK, Balakrishnan S. International Journal of Parallel Programming. 2004, No. 5
3. An efficient causal logging scheme for recoverable distributed shared memory systems [J]. Taesoon Park, Inseon Lee, Heon Y. Yeom. Parallel Computing. 2002, No. 11
4. A recoverable distributed shared memory integrating coherence and recoverability [C]. Kermarrec, A.-M., Cabillic, Fault-Tolerant Computing, 1995. FTCS-25. Digest of Papers., Twenty-Fifth International Symposium on. 1995
5. Integrating transactions and distributed shared memory for distributed programming [D]. Souto, Pedro Alexandre Guimaraes Lobo Ferrei. 1999
6. Performance of parallel FDTD method for shared- and distributed-memory architectures: Application to bioelectromagnetics [O]. Miguel Ruiz-Cabello N., Maksims Abaļenkovs, Luis M. Diaz Angulo, 2020
7. A Recoverable Distributed Shared Memory Integrating Coherence and Recoverability [O]. Kermarrec, Anne-Marie, Cabillic, Gilbert, Gefflaut, Alain, 1995
8. Recoverable Distributed Shared Memory Under Sequential and Relaxed Consistency [R]. Janssens, B., Fuchs, W. K. 1995
