Reproducible and Accurate Matrix Multiplication

Abstract

Due to non-associativity of floating-point operations and dynamic scheduling on parallel architectures, getting a bit-wise reproducible floating-point result for multiple executions of the same code on different or even similar parallel architectures is challenging. In this paper, we address the problem of reproducibility in the context of matrix multiplication and propose an algorithm that yields both reproducible and accurate results. This algorithm is composed of two main stages: a filtering stage that uses fast vectorized floating-point expansions in conjunction with error-free transformations; an accumulation stage based on Kulisch long accumulators in a high-radix carry-save representation. Finally, we provide implementations and performance results in parallel environments like GPUs.
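The two-stage idea in the abstract can be illustrated with a minimal sketch in pure Python. This is not the authors' implementation. It assumes IEEE 754 binary64 arithmetic with rounding to nearest. The names `two_sum`, `split`, `two_prod`, `exact_dot`, `reproducible_matmul`, and the parameter `expansion_size` are hypothetical. `fractions.Fraction` stands in for the Kulisch long accumulator: it is exact, but it is not the high-radix carry-save representation the paper describes, and nothing here is vectorized or parallel.

```python
from fractions import Fraction


def two_sum(a, b):
    """Error-free sum: returns (s, e) with s = fl(a + b) and s + e == a + b exactly."""
    s = a + b
    b_virtual = s - a
    a_virtual = s - b_virtual
    return s, (a - a_virtual) + (b - b_virtual)


def split(a):
    """Veltkamp splitting of a double into a high and a low part."""
    c = 134217729.0 * a  # 2**27 + 1
    hi = c - (c - a)
    return hi, a - hi


def two_prod(a, b):
    """Error-free product (Dekker): p + e == a * b, barring overflow/underflow."""
    p = a * b
    a_hi, a_lo = split(a)
    b_hi, b_lo = split(b)
    e = a_lo * b_lo - (((p - a_hi * b_hi) - a_lo * b_hi) - a_hi * b_lo)
    return p, e


def exact_dot(xs, ys, expansion_size=4):
    """Dot product rounded once from the exact result.

    Filtering stage: a small floating-point expansion absorbs each term.
    Accumulation stage: whatever does not fit goes to an exact accumulator
    (a Fraction here, as a stand-in for a Kulisch long accumulator).
    """
    expansion = [0.0] * expansion_size
    acc = Fraction(0)

    def feed(x):
        nonlocal acc
        for i in range(expansion_size):
            expansion[i], x = two_sum(expansion[i], x)
            if x == 0.0:
                return  # fully absorbed by the expansion
        acc += Fraction(x)  # residual falls through to the exact accumulator

    for a, b in zip(xs, ys):
        p, e = two_prod(a, b)
        feed(p)
        feed(e)

    for term in expansion:  # flush the expansion into the accumulator
        acc += Fraction(term)
    return float(acc)  # single rounding of the exact sum


def reproducible_matmul(A, B):
    """C = A * B for lists of rows, with every entry computed by exact_dot."""
    columns = list(zip(*B))
    return [[exact_dot(row, col) for col in columns] for row in A]


# Non-associativity: these two orderings differ in the last bit.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
```

Because each entry is rounded only once from the exact sum of products, the result does not depend on the order in which terms are accumulated. That order-independence is the property a parallel reduction would need for bit-wise reproducibility.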

Bibliographic record

Source
International Symposium on Scientific Computing, Computer Arithmetic, and Validated Numerics, 2016, 12 pages
Authors
Roman Iakymchuk; David Defour; Sylvain Collange; Stef Graillat
Original format: PDF
Chinese Library Classification: TP3-53
Keywords
Matrix multiplication; Reproducibility; Accuracy; Kulisch long accumulator; Error-free transformation; Floating-point expansion; Rounding-to-nearest; GPUs

Similar literature

1. A sparse matrix-vector multiplication based algorithm for accurate density matrix computations on systems of millions of atoms [J]. Ghale Purnima, Johnson Harley T. Computer Physics Communications. 2018
2. Improvement of error-free splitting for accurate matrix multiplication [J]. Ozaki Katsuhisa, Ogita Takeshi, Oishi Shinichi. Journal of Computational and Applied Mathematics. 2015
3. Accurate cross-architecture performance modeling for sparse matrix-vector multiplication (SpMV) on GPUs [J]. Ping Guo, Liqiang Wang. Concurrency, practice and experience. 2015, no. 13
4. Reproducible and Accurate Matrix Multiplication [C]. Roman Iakymchuk, David Defour, Sylvain Collange. Scientific computing, computer arithmetic and validated numerics. 2016
5. Optimizing Tall-and-skinny Matrix-matrix Multiplication on GPUs [D]. Xiong, Nan. 2018
6. Hierarchical Orthogonal Matrix Generation and Matrix-Vector Multiplications in Rigid Body Simulations [O]. Fuhui Fang, Jingfang Huang, Gary Huber
7. Accurate cross-architecture performance modeling for sparse matrix-vector multiplication (SpMV) on GPUs [O]. Ping Guo, Liqiang Wang. 2014
