System and method for comprehensive availability management in a high-availability computer system
Abstract
A system and method for availability management coordinates the operational states of components to implement a desired redundancy model within a high-availability computing system. Within the availability management system, an availability manager monitors reports on the status of components and nodes within the system. The availability manager uses these reports to direct components to change state when necessary in order to maintain the desired system redundancy model. The availability management system includes a health monitor, which performs component status audits on individual components and reports component status changes. The system also includes a watch-dog timer, which monitors the health monitor and reboots the entire node containing the health monitor if the health monitor becomes non-responsive. Each node within the system also includes a cluster membership monitor, which detects nodes that become non-responsive and reports node non-responsive errors. The availability management system also includes a multicomponent error correlator (MCEC), which uses pre-specified rules to correlate multiple specific and non-specific errors and infer a particular component problem. If a particular component problem is found, the MCEC reports a component status change to the availability manager.
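The interaction between the MCEC and the availability manager described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the class names `AvailabilityManager`, `Rule` and `ErrorCorrelator`, the error strings, the component names `db-a` and `db-b`, and the choice of a simple active/standby redundancy model. A rule here fires once all of its listed errors have been reported, at which point the correlator reports the suspect component as unhealthy and the manager promotes a standby.

```python
from dataclasses import dataclass, field


@dataclass
class AvailabilityManager:
    """Hypothetical manager for an active/standby redundancy model."""
    states: dict  # component name -> "active" | "standby" | "failed"

    def report_status_change(self, component: str, healthy: bool) -> None:
        # Ignore healthy reports and components already marked failed.
        if healthy or self.states.get(component) == "failed":
            return
        was_active = self.states.get(component) == "active"
        self.states[component] = "failed"
        if was_active:
            # Promote the first standby so that one component stays active.
            for name, state in self.states.items():
                if state == "standby":
                    self.states[name] = "active"
                    break


@dataclass
class Rule:
    """Pre-specified rule: if all `required` errors are seen, blame `suspect`."""
    required: frozenset
    suspect: str


@dataclass
class ErrorCorrelator:
    """Hypothetical stand-in for the multicomponent error correlator (MCEC)."""
    rules: list
    manager: AvailabilityManager
    seen: set = field(default_factory=set)

    def report_error(self, error: str) -> None:
        self.seen.add(error)
        for rule in self.rules:
            if rule.required <= self.seen:
                # All errors named by the rule are present: infer the fault.
                self.manager.report_status_change(rule.suspect, healthy=False)
                self.seen -= rule.required  # consume the correlated errors


manager = AvailabilityManager({"db-a": "active", "db-b": "standby"})
mcec = ErrorCorrelator(
    rules=[Rule(frozenset({"db_timeout", "disk_io_error"}), "db-a")],
    manager=manager,
)
mcec.report_error("db_timeout")     # one error alone matches no rule
mcec.report_error("disk_io_error")  # both present: db-a is inferred faulty
print(manager.states)               # {'db-a': 'failed', 'db-b': 'active'}
```

The sketch omits the health monitor, watch-dog timer and cluster membership monitor; in the described system these would be further sources of status reports feeding the same availability manager.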