Journal: International Journal of Electrical Power & Energy Systems

GPU-accelerated sparse matrices parallel inversion algorithm for large-scale power systems

Abstract

State-of-the-art graphics processing units (GPUs) offer superior floating-point performance and memory bandwidth, and therefore have great potential in many computationally intensive power system applications, one of which is the inversion of large-scale sparse matrices. Sparse matrix inversion is a fundamental component of many power system analyses; it requires solving a massive number of forward and backward substitution (F&B) subtasks, which makes it a good candidate for GPU acceleration. By solving multiple F&B subtasks concurrently and applying a series of performance tunings matched to the GPU architecture, we develop a batch F&B algorithm on GPUs that not only extracts the intra-level parallelism inside a single F&B subtask but also exploits a more regular parallelism among the massive number of F&B subtasks, called inter-task parallelism. A case study on a 9241-dimension case shows that the proposed batch F&B solver consumes 2.92 μs per forward substitution (FS) subtask when the batch size is 3072, a 65-fold speedup relative to the KLU library. On this basis, a complete design process for the GPU-based inversion algorithm is proposed. By offloading the heavy computational burden to the GPU, the inversion of the 9241-dimension case takes only 97 ms, an 8.1-fold speedup relative to a 12-core CPU inversion solver based on the KLU library. The proposed batch F&B solver is also promising for other power system applications that require solving massive numbers of F&B subtasks, such as probabilistic power flow analysis.
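The idea of inverting a matrix through batched F&B subtasks can be illustrated with a small CPU-side sketch. This is not the paper's GPU implementation. It assumes the matrix has already been factorized as A = LU, that each triangular factor is stored as a list of sparse rows of (column, value) pairs including the diagonal, and that the inverse is built column by column by solving against unit vectors. The names `batch_substitution` and `invert_from_lu` are hypothetical. The innermost loop over `t` runs across the independent subtasks of one batch, which is the dimension a GPU could map to threads under the inter-task parallelism described above.

```python
# Sketch only: pure-Python stand-in for a batch F&B solver.
# Storage layout and function names are assumptions made for illustration.

def batch_substitution(rows, rhs, order):
    """Solve T X = RHS for a sparse triangular T and a batch of right-hand sides.

    rows[i] : list of (column, value) pairs of row i, diagonal included
    rhs[i]  : list with one entry per subtask in the batch
    order   : row order, ascending for forward, descending for backward
    """
    batch = len(rhs[0])
    x = [None] * len(rows)
    for i in order:
        acc = list(rhs[i])
        diag = None
        for j, v in rows[i]:
            if j == i:
                diag = v
            else:
                xj = x[j]  # row j was already solved earlier in `order`
                for t in range(batch):  # inter-task dimension
                    acc[t] -= v * xj[t]
        x[i] = [a / diag for a in acc]
    return x


def invert_from_lu(lower_rows, upper_rows, batch_size):
    """Build inv(L U) column by column, batch_size unit vectors at a time."""
    n = len(lower_rows)
    inverse = [[0.0] * n for _ in range(n)]
    for start in range(0, n, batch_size):
        cols = range(start, min(start + batch_size, n))
        # One unit vector per subtask: column c of the identity matrix.
        rhs = [[1.0 if i == c else 0.0 for c in cols] for i in range(n)]
        y = batch_substitution(lower_rows, rhs, range(n))              # forward
        x = batch_substitution(upper_rows, y, range(n - 1, -1, -1))   # backward
        for i in range(n):
            for k, c in enumerate(cols):
                inverse[i][c] = x[i][k]
    return inverse


# Toy 3x3 example: L and U are factors of A = [[4, 2, 0], [2, 4, 1], [0, 0.75, 2.25]].
lower_rows = [[(0, 1.0)], [(0, 0.5), (1, 1.0)], [(1, 0.25), (2, 1.0)]]
upper_rows = [[(0, 4.0), (1, 2.0)], [(1, 3.0), (2, 1.0)], [(2, 2.0)]]
inverse = invert_from_lu(lower_rows, upper_rows, batch_size=2)
```

If the inverse is built column by column as in this sketch, the 9241 unit-vector right-hand sides of the case study would form four batches at a batch size of 3072: three full batches and one of 25 columns.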