Fault containment and error recovery in a scalable multiprocessor

Abstract

A multi-processor computer system permits various types of partitions to be implemented to contain and isolate hardware failures. The various types of partitions include hard, semi-hard, firm, and soft partitions. Each partition can include one or more processors. Upon detecting a failure associated with a processor, the connection to adjacent processors in the system can be severed, thereby precluding corrupted data from contaminating the rest of the system. If an inter-processor connection is severed, message traffic in the system can become congested as messages become backed up in other processors. Accordingly, each processor includes various timers to monitor for traffic congestion that may be due to a severed connection. Rather than letting the processor continue to wait to be able to transmit its messages, the timers will expire at preprogrammed time periods and the processor will take appropriate action, such as simply dropping queued messages, to keep the system from locking up.
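
The timer behavior described in the abstract can be illustrated with a minimal sketch. This is not the patented implementation: the class `OutboundLink`, its fields, and the single `timeout` value are hypothetical names chosen for illustration, and time is passed in explicitly as a number rather than read from hardware. The sketch assumes one outbound queue per processor and models only the "drop queued messages" action the abstract mentions.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OutboundLink:
    """Hypothetical model of one processor's outbound message queue."""
    timeout: float                          # preprogrammed period (arbitrary time units)
    queue: deque = field(default_factory=deque)
    severed: bool = False                   # True once the link to the neighbor is cut
    stalled_since: Optional[float] = None   # when the queue first failed to drain
    dropped: int = 0                        # count of messages discarded so far

    def enqueue(self, msg):
        self.queue.append(msg)

    def tick(self, now: float):
        """Try to send one message; drop the backlog if stalled for `timeout`."""
        if not self.queue:
            self.stalled_since = None
            return None
        if not self.severed:
            # Link is healthy: send the head message and reset the timer.
            self.stalled_since = None
            return self.queue.popleft()
        if self.stalled_since is None:
            # First failed attempt: start the congestion timer.
            self.stalled_since = now
        elif now - self.stalled_since >= self.timeout:
            # Timer expired: discard queued messages instead of waiting forever.
            self.dropped += len(self.queue)
            self.queue.clear()
            self.stalled_since = None
        return None
```

Calling `tick` repeatedly on a link whose `severed` flag is set leaves the queue intact until `timeout` has elapsed since the first failed attempt, then empties it, so the sender does not block indefinitely on a connection that will never drain.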