
The robust middleware approach for transparent and systematic fault tolerance in parallel and distributed systems


Abstract

We propose the robust middleware approach to transparent fault tolerance in parallel and distributed systems. The approach inserts a robust middleware layer between algorithms/programs and the system architecture/hardware. With the robust middleware, hardware faults are transparent to algorithms/programs, so ordinary algorithms/programs developed for fault-free networks can run on faulty parallel/distributed systems without modification. Moreover, the robust middleware automatically adds fault tolerance to ordinary algorithms/programs, so no hardware redundancy or reconfiguration capability is required and no assumption is made about the availability of a complete subnetwork (of lower dimension or smaller size). We also propose nomadic agent multithreaded programming as a novel fault-aware programming paradigm that is independent of network topologies and fault patterns. Nomadic agent multithreaded programming adapts to fault, traffic, and workload patterns, and can take advantage of various components of the robust middleware, including its fault tolerance features and multiple embeddings, without relying on specialized robust algorithms.
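To make the idea of transparency concrete, the following is a minimal sketch, not the paper's design: a hypothetical `RobustMiddleware` class sits between a program and a network, and the program calls `send` exactly as it would on a fault-free network while the layer finds a detour around faulty nodes. The class name, the `route`/`send` methods, the `hypercube` helper, and the use of breadth-first search are all assumptions chosen for illustration; the abstract does not specify the routing mechanism.

```python
from collections import deque


class RobustMiddleware:
    """Hypothetical shim: programs address nodes as if the network were fault-free."""

    def __init__(self, links, faulty):
        # links: dict mapping each node to an iterable of its neighbours
        self.links = {node: set(nbrs) for node, nbrs in links.items()}
        self.faulty = set(faulty)

    def route(self, src, dst):
        """Return a shortest path that avoids faulty nodes, or None if none exists."""
        if src in self.faulty or dst in self.faulty:
            return None
        parent = {src: None}
        queue = deque([src])
        while queue:
            node = queue.popleft()
            if node == dst:
                # Walk the parent links back to the source.
                path = []
                while node is not None:
                    path.append(node)
                    node = parent[node]
                return path[::-1]
            for nxt in sorted(self.links[node]):
                if nxt not in self.faulty and nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        return None

    def send(self, src, dst, payload):
        """Entry point for the unmodified program; the detour stays hidden from it."""
        path = self.route(src, dst)
        if path is None:
            raise RuntimeError("destination unreachable")
        return {"path": path, "payload": payload}


def hypercube(dim):
    """Links of a dim-dimensional hypercube: neighbours differ in exactly one bit."""
    return {n: [n ^ (1 << b) for b in range(dim)] for n in range(2 ** dim)}


if __name__ == "__main__":
    mw = RobustMiddleware(hypercube(3), faulty={1, 2})
    print(mw.send(0, 3, "hello")["path"])
```

In this toy 3-dimensional hypercube, nodes 1 and 2 are marked faulty, so a message from node 0 to node 3 takes a longer path through nodes 4, 5, and 7. The calling code is identical with or without faults, which is the property the abstract calls transparency; the sketch says nothing about embeddings or the nomadic agent paradigm.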
