Fault tolerance is explored for spacecraft computers employing Field-Programmable Gate Arrays (FPGAs). Techniques are investigated for tolerating Single Event Upsets (SEUs) caused by radiation in the space environment. A new architectural approach is proposed for achieving SEU tolerance that minimizes power and size overhead by reducing the precision with which error checking is done. This Reduced Precision Redundancy (RPR) approach is compared to the traditional Triple Modular Redundancy (TMR) method. A methodology is presented for quantifying the costs and benefits of various performance factors, and thereby determining optimal design solutions. This methodology treats reliability as a performance factor that can be traded off against factors such as power, size, and speed. An SEU simulation system is developed for studying the effect of SEUs on actual FPGA circuits. Live proton radiation testing and computer-controlled fault injection simulations demonstrate the effectiveness of RPR and TMR. Computer simulations of power usage demonstrate the savings achieved with RPR: RPR is as reliable as TMR while requiring 1/3 to 1/2 as much power. The effect of the imprecise computations that an RPR system may produce is studied. An image processing application illustrates the types of problems to which RPR can be applied effectively.
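
The contrast between TMR and RPR can be illustrated in software. The following is a minimal sketch under stated assumptions, not the architecture developed in this work: it assumes one full-precision module plus two reduced-precision copies that operate on truncated operands, a single-fault model, and a simple threshold comparison. The function names, the adder workload, and the choice of dropped bits are all hypothetical.

```python
def tmr_vote(a: int, b: int, c: int) -> int:
    """Bitwise majority vote over three full-precision results (TMR)."""
    return (a & b) | (a & c) | (b & c)


def truncate(x: int, dropped_bits: int) -> int:
    """Zero the low-order bits to model a reduced-precision operand."""
    return (x >> dropped_bits) << dropped_bits


def rp_add(a: int, b: int, dropped_bits: int) -> int:
    """Hypothetical reduced-precision adder: adds truncated operands."""
    return truncate(a, dropped_bits) + truncate(b, dropped_bits)


def rpr_select(full: int, rp1: int, rp2: int, threshold: int) -> int:
    """Choose an output from one full-precision and two reduced-precision results.

    Assumes at most one of the three results is corrupted.
    """
    if rp1 != rp2:
        # A reduced-precision copy was upset, so the full result is trusted.
        return full
    if abs(full - rp1) <= threshold:
        # The full result is consistent with the reduced-precision estimate.
        return full
    # The full result is implausible: fall back to the imprecise estimate.
    return rp1


# Example: 4 low bits dropped per operand, so the reduced-precision sum can
# fall short of the exact sum by at most 2 * (2**4 - 1) = 30.
DROPPED = 4
THRESHOLD = 2 * (2**DROPPED - 1)

a, b = 200, 100
full = a + b                      # exact result
rp = rp_add(a, b, DROPPED)        # reduced-precision estimate

fault_free = rpr_select(full, rp, rp, THRESHOLD)
upset_high_bit = rpr_select(full ^ 0x100, rp, rp, THRESHOLD)  # large error caught
upset_low_bit = rpr_select(full ^ 0x001, rp, rp, THRESHOLD)   # small error passes
upset_rp_copy = rpr_select(full, rp ^ 0x040, rp, THRESHOLD)   # RP copy upset
tmr_result = tmr_vote(full, full ^ 0x100, full)               # TMR masks the upset
```

In this sketch an upset in a high-order bit of the full-precision result is replaced by the reduced-precision estimate, which is imprecise but bounded in error, while an upset in a low-order bit passes through undetected. That is the kind of imprecise output whose effect is studied here; TMR, by contrast, recovers the exact value at the cost of three full-precision copies.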