
Apparatus and method for building distributed fault-tolerant/high-availability computer applications


Abstract

A software architecture for developing distributed fault-tolerant systems independent of the underlying hardware architecture and operating system. Systems built using the architecture components are scalable and allow a set of computer applications to operate in fault-tolerant/high-availability mode, in distributed processing mode, or in many possible combinations of distributed and fault-tolerant modes in the same system, without any modification to the architecture components. The software architecture defines system components that are modular and address problems in present systems. The architecture uses a System Controller, which controls system activation, initial load distribution, fault recovery, load redistribution, and system topology, and implements system maintenance procedures. An Application Distributed Fault-Tolerant/High-Availability Support Module (ADSM) enables an application(s) to operate in various distributed fault-tolerant modes. The System Controller uses the ADSM's well-defined API to control the state of the application in these modes. The Router architecture component provides transparent communication between applications during fault recovery and topology changes. An Application Load Distribution Module (ALDM) component distributes incoming external events to the distributed application. The architecture allows for a Load Manager, which monitors the load on the various copies of the application and maximizes hardware usage by providing dynamic load balancing. The architecture also allows for a Fault Manager, which performs fault detection, fault location, and fault isolation, and uses the System Controller's API to initiate fault recovery. These architecture components can be used to achieve a variety of distributed-processing, high-availability system configurations, which reduces cost and development time.
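The interaction between the System Controller, the ADSM, and the Router can be illustrated with a minimal Python sketch. The abstract does not disclose any actual API, so every class, method, and state name below (`Adsm`, `SystemController`, `Router`, `set_state`, `recover_from_fault`, the `State` values) is a hypothetical stand-in chosen for illustration. The sketch shows only one possible configuration: a single active copy with standby copies, where a reported fault causes the controller to promote a standby and the router keeps resolving messages to whichever copy is active.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    # Hypothetical application states; the abstract does not enumerate them.
    ACTIVE = "active"
    STANDBY = "standby"
    OUT_OF_SERVICE = "out_of_service"


@dataclass
class Adsm:
    """Hypothetical stand-in for the ADSM attached to one application copy."""
    node: str
    state: State = State.OUT_OF_SERVICE

    def set_state(self, state: State) -> None:
        # Assumed entry point the controller calls to change the copy's state.
        self.state = state


@dataclass
class SystemController:
    """Hypothetical controller: activates copies and drives fault recovery."""
    copies: dict = field(default_factory=dict)  # node name -> Adsm

    def activate(self, active_node, standby_nodes):
        # System activation: one active copy, the rest on standby.
        self.copies[active_node] = Adsm(active_node, State.ACTIVE)
        for node in standby_nodes:
            self.copies[node] = Adsm(node, State.STANDBY)

    def active_node(self):
        for node, adsm in self.copies.items():
            if adsm.state is State.ACTIVE:
                return node
        return None

    def recover_from_fault(self, failed_node):
        # Called by a fault manager once a fault has been located and isolated.
        failed = self.copies[failed_node]
        was_active = failed.state is State.ACTIVE
        failed.set_state(State.OUT_OF_SERVICE)
        if was_active:
            # Promote the first available standby copy.
            for adsm in self.copies.values():
                if adsm.state is State.STANDBY:
                    adsm.set_state(State.ACTIVE)
                    break
        return self.active_node()


class Router:
    """Hypothetical router: senders never name a physical node."""

    def __init__(self, controller):
        self.controller = controller

    def route(self, message):
        node = self.controller.active_node()
        if node is None:
            raise RuntimeError("no active copy available")
        return (node, message)
```

In this sketch a caller would create a `SystemController`, call `activate("node-a", ["node-b"])`, and send messages through `Router.route`. After `recover_from_fault("node-a")`, the same `route` call resolves to `node-b` without the sender changing anything, which is the kind of transparency the abstract attributes to the Router component. Load distribution (ALDM) and dynamic load balancing (Load Manager) are left out to keep the example short.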
