Algorithm-based fault tolerance on a hypercube multiprocessor

Banerjee P.; Rahmeh J.T.

Abstract

The design of fault-tolerant hypercube multiprocessor architecture is discussed. The authors propose the detection and location of faulty processors concurrently with the actual execution of parallel applications on the hypercube using a novel scheme of algorithm-based error detection. System-level error detection mechanisms have been implemented for three parallel applications on a 16-processor Intel iPSC hypercube multiprocessor: matrix multiplication, Gaussian elimination, and fast Fourier transform. Schemes for other applications are under development. Extensive studies have been done of error coverage of the system-level error detection schemes in the presence of finite-precision arithmetic, which affects the system-level encodings. Two reconfiguration schemes are proposed that allow the authors to isolate and replace faulty processors with spare processors.
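As a rough illustration of what algorithm-based error detection for matrix multiplication can look like, the sketch below uses the classic row/column checksum encoding. This record does not describe the paper's exact scheme, so the encoding, the function names (`with_column_checksum`, `with_row_checksum`, `locate_errors`), and the tolerance values are all assumptions made for illustration. The tolerance stands in for the finite-precision concern raised in the abstract: with floating-point arithmetic, checksums rarely match exactly, so the comparison has to allow for rounding.

```python
import math


def matmul(a, b):
    """Plain triple-loop matrix product on lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def with_column_checksum(a):
    """Append one row holding the sum of each column (hypothetical helper)."""
    return a + [[sum(col) for col in zip(*a)]]


def with_row_checksum(b):
    """Append one column holding the sum of each row (hypothetical helper)."""
    return [row + [sum(row)] for row in b]


def locate_errors(c_full, rel_tol=1e-9, abs_tol=1e-12):
    """Return (bad_rows, bad_cols) whose checksums disagree beyond tolerance.

    The tolerances are assumed values; too tight a setting gives false
    alarms from rounding, too loose a setting lets small errors escape.
    """
    n = len(c_full) - 1       # number of data rows
    m = len(c_full[0]) - 1    # number of data columns
    bad_rows = [i for i in range(n)
                if not math.isclose(sum(c_full[i][:m]), c_full[i][m],
                                    rel_tol=rel_tol, abs_tol=abs_tol)]
    bad_cols = [j for j in range(m)
                if not math.isclose(sum(c_full[i][j] for i in range(n)),
                                    c_full[n][j],
                                    rel_tol=rel_tol, abs_tol=abs_tol)]
    return bad_rows, bad_cols


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[0.5, 1.5], [2.5, 3.5]]

# The product of the encoded operands carries its own row and column checksums.
C_full = matmul(with_column_checksum(A), with_row_checksum(B))
clean_report = locate_errors(C_full)

# Simulate a faulty processor corrupting one element of the result.
C_full[1][0] += 1.0
faulty_report = locate_errors(C_full)
```

A single corrupted element shows up as one inconsistent row and one inconsistent column, and their intersection points at the faulty element. On a multiprocessor, that position could then be mapped back to the processor that computed it.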


Bibliographic details

Source: IEEE Transactions on Computers, 1990, no. 9, pp. 1132-1145 (14 pages)
Authors: Banerjee P.; Rahmeh J.T.
Original format: PDF
Language: English
CLC classification: Computing technology, computer technology

Similar literature

1. Subcube fault tolerance in hypercube multiprocessors [J]. Chang Y., Bhuyan L.N. IEEE Transactions on Computers. 1995, no. 9
2. Tradeoffs in the design of efficient algorithm-based error detection schemes for hypercube multiprocessors [J]. Balasubramanian V., Banerjee P. IEEE Transactions on Software Engineering. 1990, no. 2
3. Design of algorithm-based fault-tolerant multiprocessor systems for concurrent error detection and fault diagnosis [J]. Vinnakota B., Jha N.K. IEEE Transactions on Parallel and Distributed Systems. 1994, no. 10
4. Practical algorithm-based fault tolerant DFT system implementation on a hypercube multiprocessor [C]. San-Lung Sung, Redinbo G.R. 1993
5. System reliability through algorithm-based fault tolerance and reconfiguration [D]. Ramanathan, Gowri. 1998
6. An improved CS-LSSVM algorithm-based fault pattern recognition of ship power equipments [O]. Yifei Yang, Minjia Tan, Yuewei Dai
7. Subcube Fault Tolerance in Hypercube Multiprocessors [O]. Yeimkuan Chang, Laxmi N. Bhuyan. 1995
