Journal: Acta Numerica

Linear algebra software for large-scale accelerated multicore computing


Abstract

Many crucial scientific computing applications, ranging from national security to medical advances, rely on high-performance linear algebra algorithms and technologies, underscoring their importance and broad impact. Here we present the state-of-the-art design and implementation practices for the acceleration of the predominant linear algebra algorithms on large-scale accelerated multicore systems. Examples are given with fundamental dense linear algebra algorithms - from the LU, QR, Cholesky, and LDL^T factorizations needed for solving linear systems of equations, to eigenvalue and singular value decomposition (SVD) problems. The implementations presented are readily available via the open-source PLASMA and MAGMA libraries, which represent the next-generation modernization of the popular LAPACK library for accelerated multicore systems. To generate the extreme level of parallelism needed for the efficient use of these systems, algorithms of interest are redesigned and then split into well-chosen computational tasks. The task execution is scheduled over the computational components of a hybrid system of multicore CPUs with GPU accelerators and/or Xeon Phi coprocessors, using either static scheduling or light-weight runtime systems. The use of light-weight runtime systems keeps scheduling overheads low, similar to static scheduling, while enabling the expression of parallelism through sequential-like code. This simplifies the development effort and allows exploration of the unique strengths of the various hardware components. Finally, we emphasize the development of innovative linear algebra algorithms using three technologies - mixed precision arithmetic, batched operations, and asynchronous iterations - that are currently of high interest for accelerated multicore systems.
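
To make the idea of splitting a factorization into computational tasks concrete, here is a minimal sketch of a tiled Cholesky factorization in plain Python. It is not the PLASMA or MAGMA API: the function names (`potrf`, `trsm`, `gemm`, `tiled_cholesky`) and the tile size `nb` are illustrative assumptions, named after the standard LAPACK/BLAS kernels they mimic. The driver is ordinary sequential code that also records each kernel call as a task; a runtime system could, in principle, take such a task list with its data dependencies and schedule it across CPU cores and accelerators.

```python
import math

def potrf(A, k0, k1):
    """Unblocked Cholesky of the diagonal tile A[k0:k1, k0:k1] (lower part)."""
    for j in range(k0, k1):
        d = A[j][j] - sum(A[j][p] ** 2 for p in range(k0, j))
        A[j][j] = math.sqrt(d)
        for i in range(j + 1, k1):
            s = A[i][j] - sum(A[i][p] * A[j][p] for p in range(k0, j))
            A[i][j] = s / A[j][j]

def trsm(A, k0, k1, i0, i1):
    """Solve X * L_kk^T = A_ik in place for the tile with rows i0:i1, cols k0:k1."""
    for i in range(i0, i1):
        for j in range(k0, k1):
            s = A[i][j] - sum(A[i][p] * A[j][p] for p in range(k0, j))
            A[i][j] = s / A[j][j]

def gemm(A, k0, k1, i0, i1, j0, j1):
    """Trailing update A_ij -= L_ik * L_jk^T (acts as SYRK on a diagonal tile)."""
    for i in range(i0, i1):
        for j in range(j0, min(j1, i + 1)):  # touch the lower triangle only
            A[i][j] -= sum(A[i][p] * A[j][p] for p in range(k0, k1))

def tiled_cholesky(A, nb):
    """Factor SPD matrix A in place (lower triangle); return the task list.

    nb is a hypothetical tile size; each appended tuple names one task.
    """
    n = len(A)
    tiles = [(s, min(s + nb, n)) for s in range(0, n, nb)]
    tasks = []
    for k, (k0, k1) in enumerate(tiles):
        potrf(A, k0, k1)
        tasks.append(("POTRF", k))
        for i in range(k + 1, len(tiles)):
            trsm(A, k0, k1, *tiles[i])
            tasks.append(("TRSM", i, k))
        for i in range(k + 1, len(tiles)):
            for j in range(k + 1, i + 1):
                gemm(A, k0, k1, *tiles[i], *tiles[j])
                tasks.append(("SYRK" if i == j else "GEMM", i, j, k))
    return tasks

def lower(A):
    """Extract the lower-triangular factor from the overwritten matrix."""
    return [[A[i][j] if j <= i else 0.0 for j in range(len(A))]
            for i in range(len(A))]

if __name__ == "__main__":
    A = [[4.0, 2.0, 2.0, 2.0],
         [2.0, 5.0, 3.0, 3.0],
         [2.0, 3.0, 6.0, 4.0],
         [2.0, 3.0, 4.0, 7.0]]
    for task in tiled_cholesky(A, nb=2):
        print(task)
    for row in lower(A):
        print(row)
```

In this sketch the tasks run in program order. The point of the task list is that many of them are independent (for example, all TRSM tasks within one step `k`), which is the kind of parallelism a scheduler can exploit.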
