Exploiting the Performance of 32 bit Floating Point Arithmetic in Obtaining 64 bit Accuracy (Revisiting Iterative Refinement for Linear Systems)

Abstract

Recent versions of microprocessors exhibit performance characteristics for 32 bit floating point arithmetic (single precision) that are substantially higher than those for 64 bit floating point arithmetic (double precision). Examples include Intel's Pentium IV and M processors, AMD's Opteron architectures and IBM's Cell Broadband Engine processor. When working in single precision, floating point operations can be performed up to two times faster on the Pentium and up to ten times faster on the Cell than in double precision. The performance enhancements in these architectures are derived by accessing extensions to the basic architecture, such as SSE2 in the case of the Pentium and the vector functions on the IBM Cell. The motivation for this paper is to exploit single precision operations whenever possible and resort to double precision at critical stages while attempting to provide the full double precision results. The results described here are fairly general and can be applied to various problems in linear algebra, such as solving large sparse systems using direct or iterative methods, and some eigenvalue problems. There are limitations to the success of this process, such as when the condition number of the problem exceeds the reciprocal of the accuracy of the single precision computations. In that case the double precision algorithm should be used.
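The approach the abstract describes (factor and solve in single precision, then correct with residuals computed in double precision) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `f32`, `lu_factor_f32`, `lu_solve_f32` and `solve_mixed`, the tolerance, and the stopping test are all assumptions made for this example. Single precision is emulated by rounding through the standard `struct` module, so the sketch shows the accuracy behaviour only and none of the speed advantage.

```python
import struct


def f32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def lu_factor_f32(a):
    """LU factorization with partial pivoting; every result is rounded to binary32."""
    n = len(a)
    lu = [[f32(v) for v in row] for row in a]
    perm = list(range(n))  # perm[i] = original index of the row now at position i
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular in single precision")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        for i in range(k + 1, n):
            lu[i][k] = f32(lu[i][k] / lu[k][k])  # multiplier, stored in place of L
            for j in range(k + 1, n):
                lu[i][j] = f32(lu[i][j] - f32(lu[i][k] * lu[k][j]))
    return lu, perm


def lu_solve_f32(lu, perm, b):
    """Solve with the single precision factors; every result is rounded to binary32."""
    n = len(lu)
    y = [f32(b[perm[i]]) for i in range(n)]
    for i in range(n):  # forward substitution (L has a unit diagonal)
        for j in range(i):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
    for i in reversed(range(n)):  # back substitution with U
        for j in range(i + 1, n):
            y[i] = f32(y[i] - f32(lu[i][j] * y[j]))
        y[i] = f32(y[i] / lu[i][i])
    return y


def solve_mixed(a, b, tol=1e-14, max_iter=30):
    """Single precision LU solve followed by refinement with double precision residuals.

    Returns (x, number_of_refinement_steps). tol and max_iter are illustrative choices.
    """
    n = len(a)
    lu, perm = lu_factor_f32(a)      # the expensive O(n^3) step, in single precision
    x = lu_solve_f32(lu, perm, b)    # initial solution, single precision accuracy
    anorm = max(sum(abs(v) for v in row) for row in a)
    bnorm = max(abs(v) for v in b)
    for step in range(max_iter + 1):
        # Residual and solution update use Python floats, i.e. double precision.
        r = [b[i] - sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
        xnorm = max(abs(v) for v in x)
        if max(abs(v) for v in r) <= tol * (anorm * xnorm + bnorm):
            return x, step
        d = lu_solve_f32(lu, perm, r)  # correction reuses the single precision factors
        x = [xi + di for xi, di in zip(x, d)]
    raise ArithmeticError("no convergence; use a double precision solver instead")


if __name__ == "__main__":
    a = [[4.0, 1.0, 2.0, 0.5], [1.0, 5.0, 1.0, 2.0],
         [2.0, 1.0, 6.0, 1.0], [0.5, 2.0, 1.0, 7.0]]
    x_true = [0.1, -0.2, 0.3, 0.7]
    b = [sum(a[i][j] * x_true[j] for j in range(4)) for i in range(4)]
    x, steps = solve_mixed(a, b)
    print("refinement steps:", steps)
    print("max error:", max(abs(u - v) for u, v in zip(x, x_true)))
```

The factorization, which dominates the cost for dense systems, runs entirely in single precision, while the double precision work is limited to the residual and the solution update. The `ArithmeticError` branch mirrors the limitation stated in the abstract: when the problem is too ill-conditioned for the single precision factors, the refinement does not converge and a double precision solver should be used instead.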

Bibliographic record

Source: Association for Computing Machinery/Institute of Electrical and Electronics Engineers Conference Supercomputing, 2006, 1 page
Format: PDF
Chinese Library Classification: Computing technology, computer technology
Keywords:
floating point arithmetic; iterative methods; linear algebra; mathematics computing; 32 bit floating point arithmetic; double precision algorithm; eigenvalue problems; iterative refinement; large sparse systems; linear systems; single precision computations;

Similar literature

1. A novel power efficient 0.64-GFlops fused 32-bit reversible floating point arithmetic unit architecture for digital signal processing applications [J]. AnanthaLakshmi A. V., Sudha Gnanou Florence. Microprocessors and Microsystems, 2017, issue juna
2. Basic Operation Performed on Arithmetic Logic Unit (ALU) For 32-Bit Floating Point Numbers: (Initial Results) [J]. Shanthala N., Nayana M., Chandrashekar C. International Journal of Electronics Engineering Research, 2017, issue 9
3. Basic Operation Performed on Arithmetic Logic Unit (ALU) For 32-Bit Floating Point Numbers: (Initial Results) [J]. Shanthala N., Chandrashekar C., Siva Yellampalli. International Journal of Applied Engineering Research, 2017, issue 12aPta2
4. Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems) [C]. Julie Langou, Julien Langou, Piotr Luszczek. ACM/IEEE conference on Supercomputing, 2006
5. Architectural Design Space Exploration of Reciprocal and Square Root for Arbitrary Precision Fixed-point and Floating-point Arithmetic [D]. Lin, Fang. 2015
6. An Improved Pattern Synthesis Iterative Method in Planar Arrays for Obtaining Efficient Footprints with Arbitrary Boundaries [O]. Aarón Ángel Salas-Sánchez, Cibrán López-Álvarez, Juan Antonio Rodríguez-González. 2021
7. A 32-bit Logarithmic Arithmetic Unit and Its Performance Compared to Floating-Point [O]. J. N. Coleman, E. I. Chester. 1999
