CLEAR: Cross-Layer Exploration for Architecting Resilience - Combining Hardware and Software Techniques to Tolerate Soft Errors in Processor Cores

Abstract

We present a first of its kind framework which overcomes a major challenge in the design of digital systems that are resilient to reliability failures: achieve desired resilience targets at minimal costs (energy, power, execution time, area) by combining resilience techniques across various layers of the system stack (circuit, logic, architecture, software, algorithm). This is also referred to as cross-layer resilience. In this paper, we focus on radiation-induced soft errors in processor cores. We address both single-event upsets (SEUs) and single-event multiple upsets (SEMUs) in terrestrial environments. Our framework automatically and systematically explores the large space of comprehensive resilience techniques and their combinations across various layers of the system stack (586 cross-layer combinations in this paper), derives cost-effective solutions that achieve resilience targets at minimal costs, and provides guidelines for the design of new resilience techniques. We demonstrate the practicality and effectiveness of our framework using two diverse designs: a simple, in-order processor core and a complex, out-of-order processor core. Our results demonstrate that a carefully optimized combination of circuit-level hardening, logic-level parity checking, and micro-architectural recovery provides a highly cost-effective soft error resilience solution for general-purpose processor cores. For example, a 50x improvement in silent data corruption rate is achieved at only 2.1% energy cost for an out-of-order core (6.1% for an in-order core) with no speed impact. However, selective circuit-level hardening alone, guided by a thorough analysis of the effects of soft errors on application benchmarks, provides a cost-effective soft error resilience solution as well (with ~1% additional energy cost for a 50x improvement in silent data corruption rate).
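The exploration step described in the abstract (enumerate combinations of techniques across layers, then keep the cheapest one that meets a resilience target) can be illustrated with a minimal sketch. This is not the CLEAR framework itself. The technique names, the improvement factors, and the energy costs below are arbitrary placeholders chosen for illustration, and the model assumes, simplistically, that improvement factors multiply and energy costs add. Only the 50x target is taken from the abstract's example.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Technique:
    name: str               # hypothetical label, not an identifier from the paper
    layer: str              # layer of the system stack the technique belongs to
    sdc_improvement: float  # placeholder multiplicative SDC-rate improvement
    energy_cost_pct: float  # placeholder energy overhead in percent


# Placeholder values for illustration only; NOT measurements from the paper.
TECHNIQUES = [
    Technique("hardening", "circuit", 10.0, 1.5),
    Technique("parity", "logic", 5.0, 1.0),
    Technique("recovery", "architecture", 1.0, 0.5),
    Technique("software_checks", "software", 4.0, 8.0),
]


def cheapest_combination(techniques, target_improvement):
    """Exhaustively search all non-empty subsets and return (combo, cost)
    for the lowest-cost subset meeting the target, or None if none does.

    Assumes a toy model: improvements multiply, energy costs add.
    """
    best = None
    for size in range(1, len(techniques) + 1):
        for combo in combinations(techniques, size):
            improvement = 1.0
            cost = 0.0
            for t in combo:
                improvement *= t.sdc_improvement
                cost += t.energy_cost_pct
            if improvement >= target_improvement:
                if best is None or cost < best[1]:
                    best = (combo, cost)
    return best


if __name__ == "__main__":
    result = cheapest_combination(TECHNIQUES, target_improvement=50.0)
    if result is None:
        print("no combination meets the target")
    else:
        combo, cost = result
        print([t.name for t in combo], cost)
```

Exhaustive enumeration is workable here because the space is small; a real exploration would also need measured costs and error-injection results per design, which this sketch does not model.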
