Journal: Journal of Computational Science

Exploiting data representation for fault tolerance



Abstract

Incorrect computer hardware behavior may corrupt intermediate computations in numerical algorithms, possibly resulting in incorrect answers. Prior work models misbehaving hardware by randomly flipping bits in memory. We start by accepting this premise, and present an analytic model for the error introduced by a bit flip in an IEEE 754 floating-point number. We then relate this finding to the linear algebra concepts of normalization and matrix equilibration. In particular, we present a case study illustrating that normalizing both vector inputs of a dot product minimizes the probability of a single bit flip causing a large error in the dot product's result. Furthermore, the absolute error is either less than one or very large, which allows detection of large errors. Then, we apply this to the GMRES iterative solver. We count all possible errors that can be introduced through faults in arithmetic in the computationally intensive orthogonalization phase of GMRES, and show that when the matrix is equilibrated, the absolute error is bounded above by one. (C) 2016 Elsevier B.V. All rights reserved.
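The effect described in the abstract, that a single bit flip in a value of small magnitude produces an absolute error that is either below one or enormous, can be illustrated with a short sketch. This is not the paper's analytic model; it simply enumerates all 64 single-bit flips of an IEEE 754 binary64 value. The helper names `flip_bit` and `absolute_errors` are hypothetical and introduced only for illustration.

```python
import struct


def flip_bit(x: float, bit: int) -> float:
    """Return x with one bit of its IEEE 754 binary64 encoding flipped.

    Bit 0 is the least significant mantissa bit, bits 52-62 hold the
    exponent, and bit 63 is the sign.
    """
    if not 0 <= bit < 64:
        raise ValueError("bit index must be in [0, 63]")
    # Reinterpret the double as a 64-bit unsigned integer.
    (as_int,) = struct.unpack("<Q", struct.pack("<d", x))
    # Flip the requested bit and reinterpret the result as a double.
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return flipped


def absolute_errors(x: float) -> list[float]:
    """Absolute error |flip(x) - x| for each of the 64 possible bit flips."""
    return [abs(flip_bit(x, bit) - x) for bit in range(64)]


if __name__ == "__main__":
    for value in (0.25, 3.0):
        errors = absolute_errors(value)
        at_least_one = [bit for bit, err in enumerate(errors) if err >= 1.0]
        print(f"x = {value}: bits giving error >= 1: {at_least_one}")
```

For `x = 0.25`, only a flip of the most significant exponent bit (bit 62) yields an error of one or more, and that error is about 2^1022, which is easy to detect. For `x = 3.0`, several exponent-bit flips produce errors of moderate size that are harder to distinguish from legitimate values, which is consistent with the abstract's motivation for normalizing inputs.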
