Leveraging PaRSEC Runtime Support to Tackle Challenging 3D Data-Sparse Matrix Problems

Abstract

The task-based programming model associated with dynamic runtime systems has gained popularity for challenging problems because of workload imbalance, heterogeneous resources, or extreme concurrency. During the last decade, low-rank matrix approximations (where the main idea consists of exploiting data sparsity, typically by compressing off-diagonal tiles up to an application-specific accuracy threshold) have been adopted to address the curse of dimensionality at extreme scale. In this paper, we create a bridge between the runtime and the linear algebra by communicating knowledge of the data sparsity to the runtime. We design and implement this synergistic approach with high user productivity in mind, in the context of the PaRSEC runtime system and the HiCMA numerical library. This requires extending PaRSEC with new features to integrate rank information into the dataflow so that proper decisions can be made at runtime. We focus on the tile low-rank (TLR) Cholesky factorization for solving 3D data-sparse covariance matrix problems arising in environmental applications. In particular, we employ the 3D exponential model of the Matérn matrix kernel, which exhibits challenging nonuniform high ranks in off-diagonal tiles. We first provide dynamic data structure management driven by a performance model to reduce extra floating-point operations. Next, we optimize the memory footprint of the application by relying on a dynamic memory allocator, supported by a rank-aware data distribution to cope with the workload imbalance. Finally, we expose further parallelism using kernel recursive formulations to shorten the critical path. Our resulting high-performance implementation outperforms existing data-sparse TLR Cholesky factorization by up to 7-fold on a large-scale distributed-memory system, while reducing the memory footprint by up to 44-fold.
This multidisciplinary work highlights the need to empower runtime systems beyond their original duty of task scheduling for servicing next-generation low-rank matrix algebra libraries.
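
As a rough illustration of why per-tile rank matters for memory, the sketch below compares the storage of a dense tile with that of a tile kept as two low-rank factors, and evaluates an exponential covariance between two 3D points. It is not taken from the paper: the function names, the parameterization `sigma2 * exp(-r / beta)`, the default parameter values, and the storage-only break-even rule are all assumptions made for illustration. The paper's own decisions are described as being driven by a performance model, which is not reproduced here.

```python
import math


def exponential_covariance(p, q, sigma2=1.0, beta=0.1):
    """Exponential kernel sigma2 * exp(-r / beta) between two 3D points.

    This is the Matern kernel with smoothness 1/2; sigma2 and beta are
    illustrative defaults, not values from the paper.
    """
    r = math.dist(p, q)  # Euclidean distance (Python 3.8+)
    return sigma2 * math.exp(-r / beta)


def dense_elements(nb):
    """Entries stored for a dense nb x nb tile."""
    return nb * nb


def low_rank_elements(nb, rank):
    """Entries stored for a tile kept as two nb x rank factors."""
    return 2 * nb * rank


def cheaper_format(nb, rank):
    """Hypothetical storage-only rule: low-rank pays off when 2*nb*rank < nb*nb."""
    if low_rank_elements(nb, rank) < dense_elements(nb):
        return "low-rank"
    return "dense"


if __name__ == "__main__":
    nb = 1000  # assumed tile size
    for rank in (50, 400, 600):
        ratio = dense_elements(nb) / low_rank_elements(nb, rank)
        print(rank, cheaper_format(nb, rank), f"dense/low-rank = {ratio:.2f}")
```

Under this storage-only view, a tile whose rank reaches half the tile size no longer benefits from the low-rank format, which is one way to see why nonuniform high ranks in off-diagonal tiles make a fixed format choice wasteful.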

Bibliographic record

Source
IEEE International Parallel and Distributed Processing Symposium | 2021 | pp. 79-89 | 11 pages
Authors
Qinglei Cao; Yu Pei; Kadir Akbudak; George Bosilca; Hatem Ltaief; David Keyes; Jack Dongarra;
Original format: PDF
Keywords
Solid modeling; Runtime; Three-dimensional displays; Programming; Dynamic scheduling; Matrices; Libraries;

Similar literature

1. HRC: A 3D NoC Architecture with Genuine Support for Runtime Thermal-Aware Task Management [J]. Xiaohang Wang, Yingtao Jiang, Mei Yang. IEEE Transactions on Computers, 2017, No. 10
2. Handling pulp and paper stock transfer: paper production is one of the most challenging services a pump can tackle, here's how to tackle it [J]. World Pumps, 1999, No. 390
3. Data-Sparse Matrix Decomposition Algorithm for Mixed Finite-Element Time-Domain Solution of Maxwell Equations [J]. Wan Ting, Du Lei, Hong Tao. Electromagnetics, 2015, No. 5a8
4. Communication Avoiding 2D Stencil Implementations over PaRSEC Task-Based Runtime [C]. Yu Pei, Qinglei Cao, George Bosilca. IEEE International Parallel and Distributed Processing Symposium Workshops, 2020
5. Leveraging UAS for 3D Orthomosaic Aircraft Images to Support Maintenance Activities [D]. Weldon, William. 2017
6. Programming and Runtime Support to Blaze FPGA Accelerator Deployment at Datacenter Scale [O]. Muhuan Huang, Di Wu, Cody Hao Yu
7. Leveraging PaRSEC Runtime Support to Tackle Challenging 3D Data-Sparse Matrix Problems [O]. Qinglei Cao, Yu Pei, Kadir Akbudak. 2021
