Conference: Association for Computing Machinery/Institute of Electrical and Electronics Engineers Conference Supercomputing

Exploiting the Performance of 32 bit Floating Point Arithmetic in Obtaining 64 bit Accuracy (Revisiting Iterative Refinement for Linear Systems)

Abstract

Recent versions of microprocessors exhibit performance characteristics for 32 bit floating point arithmetic (single precision) that are substantially higher than those for 64 bit floating point arithmetic (double precision). Examples include Intel's Pentium IV and M processors, AMD's Opteron architectures and IBM's Cell Broadband Engine processor. When working in single precision, floating point operations can be performed up to two times faster on the Pentium and up to ten times faster on the Cell than in double precision. The performance enhancements in these architectures are derived by accessing extensions to the basic architecture, such as SSE2 in the case of the Pentium and the vector functions on the IBM Cell. The motivation for this paper is to exploit single precision operations whenever possible and resort to double precision at critical stages while attempting to provide the full double precision results. The results described here are fairly general and can be applied to various problems in linear algebra, such as solving large sparse systems using direct or iterative methods, and some eigenvalue problems. There are limitations to the success of this process, such as when the conditioning of the problem exceeds the reciprocal of the accuracy of the single precision computations. In that case the double precision algorithm should be used.
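
The idea in the abstract can be illustrated with a small sketch of mixed precision iterative refinement for a dense system Ax = b. This is not the authors' implementation. It is a minimal NumPy illustration under stated assumptions: the function names (`lu_factor_f32`, `lu_solve_f32`, `solve_mixed`), the tolerance, and the iteration limit are all hypothetical choices made for this example. The expensive O(n^3) factorization and the triangular solves run in single precision, while the residual and the solution update are kept in double precision.

```python
import numpy as np

# Hypothetical helper names; a sketch, not the paper's code.

def lu_factor_f32(A):
    """LU factorization with partial pivoting, carried out in float32."""
    LU = np.array(A, dtype=np.float32)   # copy, rounded to single precision
    n = LU.shape[0]
    perm = np.arange(n)                  # row i of the pivoted matrix is row perm[i] of A
    for k in range(n - 1):
        p = k + int(np.argmax(np.abs(LU[k:, k])))   # pivot row
        if p != k:
            LU[[k, p]] = LU[[p, k]]
            perm[[k, p]] = perm[[p, k]]
        LU[k + 1:, k] /= LU[k, k]                    # multipliers (column of L)
        LU[k + 1:, k + 1:] -= np.outer(LU[k + 1:, k], LU[k, k + 1:])
    return LU, perm

def lu_solve_f32(LU, perm, b):
    """Solve using the float32 factors: forward, then back substitution."""
    n = LU.shape[0]
    y = np.asarray(b, dtype=np.float32)[perm]        # fancy indexing copies
    for i in range(1, n):                            # L y = P b (unit diagonal)
        y[i] -= LU[i, :i] @ y[:i]
    for i in range(n - 1, -1, -1):                   # U x = y
        y[i] = (y[i] - LU[i, i + 1:] @ y[i + 1:]) / LU[i, i]
    return y

def solve_mixed(A, b, tol=1e-12, max_iter=30):
    """Iterative refinement: float32 factorization, float64 residuals."""
    A = np.asarray(A, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    LU, perm = lu_factor_f32(A)                      # O(n^3) work in single
    x = lu_solve_f32(LU, perm, b).astype(np.float64)
    for it in range(max_iter):
        r = b - A @ x                                # residual in double
        if np.linalg.norm(r, np.inf) <= tol * np.linalg.norm(b, np.inf):
            return x, it
        d = lu_solve_f32(LU, perm, r)                # correction in single
        x += d.astype(np.float64)                    # update in double
    return x, max_iter                               # may not have converged

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 50
    A = rng.standard_normal((n, n)) + n * np.eye(n)  # well-conditioned test matrix
    x_true = rng.standard_normal(n)
    x, iters = solve_mixed(A, A @ x_true)
    print("refinement steps:", iters)
    print("relative error:", np.linalg.norm(x - x_true) / np.linalg.norm(x_true))
```

The factors are computed once and reused for every correction, so each refinement step costs only O(n^2) work. If the loop reaches `max_iter` without meeting the tolerance, that is a sign the matrix may be too ill-conditioned for the single precision factorization, which is the case where the abstract recommends falling back to the double precision algorithm.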
