Software modulated fault tolerance.

Abstract

In recent decades, microprocessor performance has been increasing exponentially, due in large part to smaller and faster transistors enabled by improved fabrication technology. While such transistors yield performance enhancements, their smaller size and sheer number make chips more susceptible to transient faults. Designers frequently introduce redundant hardware or software to detect these faults because process and material advances are often insufficient to mitigate their effect. Regardless of the methods used for adding reliability, these techniques incur significant performance penalties because they uniformly protect the entire application. They do not consider the varying resilience to transient faults of different program regions. This uniform protection leads to wasted resources that reduce performance and/or increase cost.

To maximize fault coverage while minimizing the performance impact, this dissertation takes advantage of the key insights that not all faults in an unprotected application will cause an incorrect answer and not all parts of the program respond the same way to reliability techniques. First, this dissertation demonstrates the varying vulnerability and performance responses of an application and identifies regions which are most susceptible to faults as well as those which are inexpensive to protect. Second, this dissertation advocates the use of software and hybrid approaches to fault tolerance to enable the synergy of high-level information with specific redundancy techniques. Third, this dissertation demonstrates how to exploit this non-uniformity via Software-Modulated Fault Tolerance. Software-Modulated Fault Tolerance leverages reliability and performance information at a high level and directs the reliability choices at fine granularities to provide the most efficient use of processor resources for an application.
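
The idea of protecting program regions non-uniformly can be illustrated with a small sketch. This is not the dissertation's algorithm: the abstract does not describe a selection procedure, so the greedy heuristic here is an assumption, and the `Region` type, the function `choose_protected_regions`, the region names, and all numbers are hypothetical and exist only for illustration. Under those assumptions, the sketch ranks regions by how much vulnerability is removed per unit of added runtime and protects them until a performance budget is used up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A program region with made-up reliability and cost estimates."""
    name: str
    vulnerability: float  # assumed share of harmful faults that land in this region
    protect_cost: float   # assumed runtime overhead (fraction of baseline) if protected


def benefit_per_cost(region: Region) -> float:
    """Vulnerability removed per unit of overhead; free protection ranks first."""
    if region.protect_cost == 0:
        return float("inf")
    return region.vulnerability / region.protect_cost


def choose_protected_regions(regions, overhead_budget):
    """Greedy heuristic (an assumption, not the author's method).

    Visits regions from best to worst benefit-per-cost ratio and protects
    each one that still fits within the remaining overhead budget.
    Returns the names of the protected regions and the overhead spent.
    """
    chosen, spent = [], 0.0
    for region in sorted(regions, key=benefit_per_cost, reverse=True):
        if spent + region.protect_cost <= overhead_budget:
            chosen.append(region.name)
            spent += region.protect_cost
    return chosen, spent


# Hypothetical profile: one region is vulnerable but expensive to protect,
# two are cheap to protect, one barely matters.
example_regions = [
    Region("hot_loop", vulnerability=0.50, protect_cost=0.40),
    Region("checksum", vulnerability=0.30, protect_cost=0.05),
    Region("logging", vulnerability=0.02, protect_cost=0.10),
    Region("init", vulnerability=0.10, protect_cost=0.05),
]

protected, overhead = choose_protected_regions(example_regions, overhead_budget=0.15)
print(protected, round(overhead, 2))
```

With a 15% overhead budget, this toy profile protects the two cheap regions and leaves the expensive loop unprotected, whereas uniform protection would pay the full cost for every region. A real system would need measured vulnerability and cost data and could choose among several redundancy techniques per region rather than a protect-or-not decision.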

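Bibliographic record

  • Author

    Reis, George A., III

  • Author affiliation

    Princeton University

  • Degree-granting institution: Princeton University
  • Subject: Engineering, Electronics and Electrical
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 104 p.
  • Total pages: 104
  • Original format: PDF
  • Language of text: eng (English)
  • Chinese Library Classification: Radio electronics, telecommunications technology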