
A dual process redundancy approach to transient fault tolerance for ccNUMA architecture


Abstract

Transient faults are a critical concern for the reliability of microprocessor systems. Software fault tolerance is more flexible and less costly than hardware fault tolerance. In addition, as architectural trends point toward multicore designs, there is substantial interest in adapting parallel and redundant hardware resources for transient fault tolerance. This paper proposes a process-level, software-centric fault tolerance technique that efficiently schedules and synchronizes redundant processes on the redundant processors of a ccNUMA system, which improves the running efficiency of the redundant processes and reduces time and space overhead. The paper focuses on methods for detecting and handling errors in the redundant processes. A real prototype, designed to be transparent to the application, is implemented. Test results show that the system promptly detects CPU and memory soft errors that cause exceptions in the redundant processes, while ensuring that the application's services are not interrupted and are delayed only briefly.
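The general idea of process-level redundancy described in the abstract can be illustrated with a minimal sketch. It is not the paper's prototype: it assumes a POSIX system with `os.fork`, does no ccNUMA-aware scheduling or synchronization, and simulates a transient fault by making one replica exit abnormally. The names `run_dual`, `_start_replica`, `_collect`, and the `faulty_replica` parameter are hypothetical, introduced only for illustration.

```python
import os
import pickle


def _start_replica(task, arg, inject_fault):
    """Fork one replica that computes task(arg) and pipes back the pickled result."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:  # child process: the redundant replica
        os.close(read_fd)
        status = 1
        try:
            if inject_fault:
                # Assumption: a soft error shows up as an exception in the replica.
                raise RuntimeError("simulated transient fault")
            with os.fdopen(write_fd, "wb") as out:
                pickle.dump(task(arg), out)
            status = 0
        finally:
            os._exit(status)  # never return into the parent's code path
    os.close(write_fd)
    return pid, read_fd


def _collect(pid, read_fd):
    """Read a replica's output and report whether it exited cleanly."""
    with os.fdopen(read_fd, "rb") as src:
        payload = src.read()
    _, wait_status = os.waitpid(pid, 0)
    ok = os.WIFEXITED(wait_status) and os.WEXITSTATUS(wait_status) == 0
    return ok, (pickle.loads(payload) if ok else None)


def run_dual(task, arg, faulty_replica=None):
    """Run task(arg) in two replicas; return (result, fault_detected)."""
    # Start both replicas before collecting so they run concurrently.
    replicas = [_start_replica(task, arg, i == faulty_replica) for i in range(2)]
    outcomes = [_collect(pid, fd) for pid, fd in replicas]
    survivors = [value for ok, value in outcomes if ok]
    if not survivors:
        raise RuntimeError("both replicas failed")
    if len(survivors) == 2 and survivors[0] != survivors[1]:
        # With only two replicas a mismatch is detectable but not correctable.
        raise RuntimeError("replica outputs disagree")
    # If one replica failed, the survivor's result keeps the service running.
    return survivors[0], len(survivors) < 2
```

Calling `run_dual(sum, [1, 2, 3])` runs both replicas and returns the agreed result with no fault flagged; passing `faulty_replica=1` makes the second replica fail, and the surviving replica's result is returned together with a fault flag. A real system would additionally pin the replicas to separate processors and restart the failed replica.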

Bibliographic record

  • Source
    Neurocomputing | 2013, Issue 25 | pp. 50-57 | 8 pages
  • Author affiliations

    Department of Computer Science & Technology, Xi'an Jiaotong University, Xi'an, China;

    Inspur (Beijing) Electronic Information Industry Co. Ltd, Beijing, China;

    Department of Computer Science & Engineering, Shanghai Jiaotong University, Shanghai, China;

    Department of Computer Science & Technology, Xi'an Jiaotong University, Xi'an, China;

    Department of Computer Science & Technology, Xi'an Jiaotong University, Xi'an, China;

    Department of Computer Science & Technology, Xi'an Jiaotong University, Xi'an, China;

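  • Indexed in: Science Citation Index (SCI, USA); Engineering Index (EI, USA)
  • Original format: PDF
  • Language of full text: English
  • Chinese Library Classification
  • Keywords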

    Transient fault; ccNUMA; Dual-process; Redundancy

