
Hierarchical fault management in computer systems


Abstract

Computer systems and methods of data processing are disclosed in which hierarchical levels of fault/event management are provided that intelligently monitor hardware and software and proactively take action in accordance with a defined fault policy. A fault policy based on a defined hierarchy ensures that for each particular type of failure the most appropriate action is taken. In one embodiment, a master Software Resiliency Manager (SRM) serves as the top hierarchical level fault/event manager, with one or more slave SRMs serving as the next hierarchical level fault/event manager. The software applications resident on each board can also include sub-processes (e.g., local resiliency managers or LRMs) that serve as the lowest hierarchical level fault/event managers.
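The three-level hierarchy described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Manager` class, the `FAULT_POLICY` table, and the fault and action names are hypothetical, chosen only to show how a fault policy might route each fault type to the hierarchical level that should act on it, with escalation from an LRM to a slave SRM to the master SRM.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical fault policy (an assumption, not taken from the patent):
# fault type -> (hierarchical level that should act, action to take).
FAULT_POLICY = {
    "process_hang": ("LRM", "restart_process"),
    "board_overheat": ("SLAVE_SRM", "reset_board"),
    "board_failure": ("MASTER_SRM", "fail_over_to_standby"),
}

# Faults with no policy entry are sent to the top level and only logged.
DEFAULT_POLICY = ("MASTER_SRM", "log_only")


@dataclass
class Manager:
    """One fault/event manager in the hierarchy (LRM, slave SRM, or master SRM)."""
    name: str
    level: str
    parent: Optional["Manager"] = None
    log: list = field(default_factory=list)

    def report(self, fault_type: str) -> str:
        """Handle the fault here if the policy assigns it to this level,
        otherwise escalate to the parent manager."""
        level, action = FAULT_POLICY.get(fault_type, DEFAULT_POLICY)
        # The top-level manager has no parent, so it handles whatever reaches it.
        if level == self.level or self.parent is None:
            self.log.append((fault_type, action))
            return f"{self.name}: {action}"
        return self.parent.report(fault_type)


# Build a small hierarchy: master SRM -> slave SRM -> LRM.
master = Manager("master-srm", "MASTER_SRM")
slave = Manager("slave-srm-1", "SLAVE_SRM", parent=master)
lrm = Manager("lrm-app-a", "LRM", parent=slave)

print(lrm.report("process_hang"))    # handled locally by the LRM
print(lrm.report("board_overheat"))  # escalated to the slave SRM
print(lrm.report("board_failure"))   # escalated to the master SRM
```

Keeping the policy in a single table, separate from the managers, means each level only needs to know its own name and its parent; changing which level handles a fault is then a one-line edit to the table.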
