A Runtime System for Programming Out-of-Core Matrix Algorithms-by-Tiles on Multithreaded Architectures

GREGORIO QUINTANA-ORTI; FRANCISCO D. IGUAL; MERCEDES MARQUES; ENRIQUE S. QUINTANA-ORTI; ROBERT A. VAN DE GEIJN

Abstract

Out-of-core implementations of algorithms for dense matrix computations have traditionally focused on optimal use of memory so as to minimize I/O, often trading programmability for performance. In this article we show how the current state of hardware and software allows the programmability problem to be addressed without sacrificing performance. This comes from the realizations that memory is cheap and large, making it less necessary to optimally orchestrate I/O, and that new algorithms view matrices as collections of submatrices and computation as operations with those submatrices. This enables libraries to be coded at a high level of abstraction, leaving the tasks of scheduling the computations and data movement in the hands of a runtime system. This is in sharp contrast to more traditional approaches that leverage optimal use of in-core memory and, at the expense of introducing considerable programming complexity, explicit overlap of I/O with computation. Performance is demonstrated for this approach on multicore architectures as well as platforms equipped with hardware accelerators.
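The idea of treating a matrix as a collection of submatrices (tiles), with data movement left to a runtime, can be illustrated with a small sketch. This is not the runtime system described in the article. `TileStore`, `split`, `gemm_tiles`, `gather`, and `tiled_matmul` are hypothetical names chosen for illustration, the tile size is assumed to divide the matrix dimension, and a simple least-recently-used cache backed by pickle files stands in for whatever policy a real runtime would use. The point is only that the algorithm (`gemm_tiles`) is written in terms of tiles and contains no explicit I/O.

```python
import os
import pickle
import tempfile
from collections import OrderedDict


class TileStore:
    """Hypothetical tile cache: holds at most `capacity` tiles in memory and
    spills the least recently used tile to a file on disk."""

    def __init__(self, capacity, directory):
        self.capacity = capacity
        self.directory = directory
        self.cache = OrderedDict()  # key -> tile (list of rows)
        self.loads = 0              # tiles read back from disk

    def _path(self, key):
        name, i, j = key
        return os.path.join(self.directory, f"{name}_{i}_{j}.tile")

    def _evict_if_full(self):
        while len(self.cache) > self.capacity:
            key, tile = self.cache.popitem(last=False)  # oldest entry
            with open(self._path(key), "wb") as f:
                pickle.dump(tile, f)

    def put(self, key, tile):
        self.cache[key] = tile
        self.cache.move_to_end(key)
        self._evict_if_full()

    def get(self, key):
        if key not in self.cache:
            with open(self._path(key), "rb") as f:
                self.cache[key] = pickle.load(f)
            self.loads += 1
        self.cache.move_to_end(key)
        self._evict_if_full()
        return self.cache[key]


def split(store, name, matrix, b):
    """Store a square matrix as b-by-b tiles; assumes b divides its size."""
    n = len(matrix)
    for i in range(0, n, b):
        for j in range(0, n, b):
            tile = [row[j:j + b] for row in matrix[i:i + b]]
            store.put((name, i // b, j // b), tile)
    return n // b


def tile_gemm(c, a, b):
    """In-core kernel: return c + a * b for three small tiles."""
    return [[c[r][s] + sum(a[r][t] * b[t][s] for t in range(len(b)))
             for s in range(len(b[0]))] for r in range(len(a))]


def gemm_tiles(store, nt):
    """Algorithm-by-tiles for C += A * B; no explicit I/O appears here."""
    for i in range(nt):
        for j in range(nt):
            for k in range(nt):
                a_t = store.get(("A", i, k))
                b_t = store.get(("B", k, j))
                c_t = store.get(("C", i, j))
                store.put(("C", i, j), tile_gemm(c_t, a_t, b_t))


def gather(store, name, nt):
    """Reassemble the full matrix from its tiles."""
    rows = []
    for i in range(nt):
        tiles = [store.get((name, i, j)) for j in range(nt)]
        for r in range(len(tiles[0])):
            rows.append([x for t in tiles for x in t[r]])
    return rows


def tiled_matmul(A, B, b, capacity):
    """Multiply A by B with only `capacity` tiles resident at a time."""
    n = len(A)
    with tempfile.TemporaryDirectory() as d:
        store = TileStore(capacity, d)
        nt = split(store, "A", A, b)
        split(store, "B", B, b)
        split(store, "C", [[0] * n for _ in range(n)], b)
        gemm_tiles(store, nt)
        return gather(store, "C", nt), store.loads
```

Calling `tiled_matmul(A, B, 2, 3)` on two 4-by-4 matrices keeps at most three 2-by-2 tiles in memory; the returned `loads` counter shows how many tiles had to be fetched from disk. A production runtime would additionally schedule tile operations across threads or accelerators and overlap I/O with computation, which this sketch does not attempt.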

Bibliographic information

Source: ACM Transactions on Mathematical Software, 2012, no. 4, pp. 25.1-25.25 (25 pages)
Authors: GREGORIO QUINTANA-ORTI; FRANCISCO D. IGUAL; MERCEDES MARQUES; ENRIQUE S. QUINTANA-ORTI; ROBERT A. VAN DE GEIJN
Author affiliations:

Departamento de Ingenieria y Ciencia de Computadores, Universidad Jaume I, 12.071-Castellon, Spain;

Departamento de Ingenieria y Ciencia de Computadores, Universidad Jaume I, 12.071-Castellon, Spain;

Departamento de Ingenieria y Ciencia de Computadores, Universidad Jaume I, 12.071-Castellon, Spain;

Departamento de Ingenieria y Ciencia de Computadores, Universidad Jaume I, 12.071-Castellon, Spain;

Department of Computer Science, The University of Texas at Austin, Austin, TX 78712;

Original format: PDF
Language: English
Keywords:
high-performance; libraries; linear algebra; multithreaded architectures; out-of-core algorithms;
