Conference: International Conference on Parallel Processing
The robust middleware approach for transparent and systematic fault tolerance in parallel and distributed systems

Abstract

In this paper, we propose the robust middleware approach to transparent fault tolerance in parallel and distributed systems. The proposed approach inserts a robust middleware layer between algorithms/programs and the system architecture/hardware. With the robust middleware, hardware faults are transparent to algorithms/programs, so ordinary algorithms/programs developed for fault-free networks can run on faulty parallel/distributed systems without modification. Moreover, the robust middleware automatically adds fault tolerance to ordinary algorithms/programs, so no hardware redundancy or reconfiguration capability is required, and no assumption is made about the availability of a complete subnetwork (of lower dimension or smaller size). We also propose nomadic agent multithreaded programming as a novel fault-aware programming paradigm that is independent of network topologies and fault patterns. Nomadic agent multithreaded programming adapts to fault, traffic, and workload patterns, and can take advantage of various components of the robust middleware, including its fault tolerance features and multiple embeddings, without relying on specialized robust algorithms.
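To make the idea of a layer between programs and faulty hardware concrete, here is a minimal sketch. It is not the authors' implementation: the hypercube topology, the class name `RobustMiddleware`, the `host` table, and the breadth-first rerouting are all assumptions chosen for illustration. The program addresses logical nodes as if the network were fault-free. The hypothetical layer embeds each logical node onto a healthy physical node and routes around the faulty ones.

```python
from collections import deque


def hypercube_neighbors(node, dim):
    """Neighbors of `node` in a `dim`-dimensional hypercube (flip one bit)."""
    return [node ^ (1 << k) for k in range(dim)]


class RobustMiddleware:
    """Hypothetical layer: programs use logical node IDs; faults stay hidden."""

    def __init__(self, dim, faulty):
        self.dim = dim
        self.faulty = set(faulty)
        # Assumed embedding: each logical node is hosted by the nearest
        # healthy physical node (a healthy node hosts itself).
        self.host = {v: self._nearest_healthy(v) for v in range(1 << dim)}

    def _nearest_healthy(self, v):
        # Breadth-first search outward from v until a healthy node is found.
        seen, queue = {v}, deque([v])
        while queue:
            u = queue.popleft()
            if u not in self.faulty:
                return u
            for w in hypercube_neighbors(u, self.dim):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        raise RuntimeError("no healthy node available")

    def route(self, src, dst):
        """Return a path of healthy physical nodes between two logical nodes,
        or None if the healthy part of the network is disconnected."""
        start, goal = self.host[src], self.host[dst]
        parent, queue = {start: None}, deque([start])
        while queue:
            u = queue.popleft()
            if u == goal:
                path = []
                while u is not None:
                    path.append(u)
                    u = parent[u]
                return path[::-1]
            for w in hypercube_neighbors(u, self.dim):
                if w not in self.faulty and w not in parent:
                    parent[w] = u
                    queue.append(w)
        return None


# Usage: a 3-dimensional hypercube in which physical nodes 1 and 3 have failed.
mw = RobustMiddleware(dim=3, faulty={1, 3})
print(mw.route(0, 7))  # a path from node 0 to node 7 that avoids nodes 1 and 3
print(mw.host[1])      # the healthy node standing in for logical node 1
```

The caller never inspects the fault set; it only names logical endpoints. That is the sense of "transparent" this sketch tries to capture, under the assumptions stated above.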