
A Task Parallel Implementation of Fast Multipole Methods

Abstract

This paper describes a task parallel implementation of ExaFMM, an open source implementation of fast multipole methods (FMM), using the lightweight task parallel library MassiveThreads. Although there have been many attempts at parallelizing FMM, experience has been limited almost exclusively to formulations based on flat, homogeneous parallel loops. FMM in fact contains operations that cannot be readily expressed in such conventional but restrictive models. We show that task parallelism, and parallel recursion in particular, allows us to parallelize all operations of FMM naturally and scalably. Moreover, it allows us to parallelize a "mutual interaction" for force/potential evaluation, which is roughly twice as efficient as the more conventional, unidirectional force/potential evaluation. The net result is an open source FMM that is clearly among the fastest single-node implementations, including those on GPUs; with a million particles on a 32-core Sandy Bridge node at 2.20 GHz, it completes a single time step, including tree construction and force/potential evaluation, in 65 milliseconds. The study clearly showcases both the programmability and the performance benefits of flexible parallel constructs over more monolithic parallel loops.
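The idea of a mutual interaction evaluated through parallel recursion can be illustrated with a small sketch. This is not ExaFMM code and does not use MassiveThreads: it is a direct pairwise sum rather than an FMM, Python threads stand in for lightweight tasks (so it shows the task structure, not a speedup), and every name in it (`kernel`, `run_pair`, `cross`, `self_interact`, `mutual_potential`, `CUTOFF`) is hypothetical. Each pair of points is evaluated once and the result is added to both points. The recursion is arranged so that subproblems running concurrently always update disjoint entries of the output array.

```python
import math
import threading

CUTOFF = 8  # ranges at or below this size are handled by plain loops


def kernel(xi, xj, eps=1e-3):
    """Softened 1/r potential between two 1-D points of unit charge."""
    return 1.0 / math.sqrt((xi - xj) ** 2 + eps ** 2)


def direct_unidirectional(x):
    """Reference: each target i sums over all sources j, N*(N-1) kernel calls."""
    n = len(x)
    phi = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i != j:
                phi[i] += kernel(x[i], x[j])
    return phi


def run_pair(func, args_a, args_b, depth):
    """Run two independent subproblems; spawn one as a task while depth > 0."""
    if depth > 0:
        task = threading.Thread(target=func, args=args_a + (depth - 1,))
        task.start()
        func(*args_b, depth - 1)
        task.join()
    else:
        func(*args_a, depth)
        func(*args_b, depth)


def cross(x, phi, a0, a1, b0, b1, depth):
    """Mutual interaction between disjoint index ranges [a0, a1) and [b0, b1)."""
    if a1 - a0 <= CUTOFF or b1 - b0 <= CUTOFF:
        for i in range(a0, a1):
            for j in range(b0, b1):
                k = kernel(x[i], x[j])  # one kernel call updates both points
                phi[i] += k
                phi[j] += k
        return
    am, bm = (a0 + a1) // 2, (b0 + b1) // 2
    # (A1, B1) and (A2, B2) write disjoint parts of phi, so they may overlap.
    run_pair(cross, (x, phi, a0, am, b0, bm), (x, phi, am, a1, bm, b1), depth)
    # Likewise (A1, B2) and (A2, B1), but only after the first phase is done.
    run_pair(cross, (x, phi, a0, am, bm, b1), (x, phi, am, a1, b0, bm), depth)


def self_interact(x, phi, lo, hi, depth):
    """Mutual interaction among all pairs inside the index range [lo, hi)."""
    if hi - lo <= CUTOFF:
        for i in range(lo, hi):
            for j in range(i + 1, hi):
                k = kernel(x[i], x[j])
                phi[i] += k
                phi[j] += k
        return
    mid = (lo + hi) // 2
    # The two halves are independent; their cross term follows afterwards.
    run_pair(self_interact, (x, phi, lo, mid), (x, phi, mid, hi), depth)
    cross(x, phi, lo, mid, mid, hi, depth)


def mutual_potential(x, depth=2):
    """Potential at every point using N*(N-1)/2 kernel calls."""
    phi = [0.0] * len(x)
    self_interact(x, phi, 0, len(x), depth)
    return phi
```

Calling `mutual_potential(x)` on a list of coordinates should agree with `direct_unidirectional(x)` up to floating-point rounding while evaluating the kernel half as often. A flat parallel loop over targets cannot do this safely without locks or atomics, because two iterations may update the same entry; the two-phase recursion avoids that conflict by construction.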
