
Portable Node-Level Performance Optimization for the Fast Multipole Method


Abstract

This article provides an in-depth analysis and high-level C++ optimization strategies for the most time-consuming kernels of a Fast Multipole Method (FMM). The two main kernels of a Coulomb FMM are formulated to support different hardware features, such as unrolling, vectorization or threading without the need to rewrite the kernels in intrinsics or even assembly. The abstract description of the algorithm automatically allows optimal node-level peak performance on a broad class of available hardware platforms. Most of the presented optimization schemes allow a generic, hence platform-independent description for other kernels as well.
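As a rough illustration of the idea described in the abstract, the sketch below writes a kernel once in plain C++ with the lane width as a template parameter, so that the compiler can unroll and vectorize the fixed-length inner loops without intrinsics. It is not the authors' code: the abstract does not name the two kernels, so a direct pairwise Coulomb sum is used here as a stand-in, and the names `Pack`, `p2p_block`, and `coulomb_potential` are hypothetical. Threading is left out to keep the example short.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical fixed-width pack of W lanes (an assumption for this sketch).
// Loops over its lanes have a compile-time trip count, which makes them
// candidates for compiler unrolling and auto-vectorization.
template <typename T, std::size_t W>
struct Pack {
    std::array<T, W> v{};  // value-initialized: all lanes start at zero
};

// Sum the Coulomb potential q_j / r_ij of W packed sources at one target.
template <typename T, std::size_t W>
T p2p_block(T tx, T ty, T tz,
            const Pack<T, W>& sx, const Pack<T, W>& sy,
            const Pack<T, W>& sz, const Pack<T, W>& sq) {
    Pack<T, W> contrib;
    for (std::size_t l = 0; l < W; ++l) {
        const T dx = tx - sx.v[l];
        const T dy = ty - sy.v[l];
        const T dz = tz - sz.v[l];
        const T r2 = dx * dx + dy * dy + dz * dz;
        // Skip coincident points (r2 == 0), e.g. the self-interaction.
        contrib.v[l] = r2 > T(0) ? sq.v[l] / std::sqrt(r2) : T(0);
    }
    T sum = T(0);
    for (std::size_t l = 0; l < W; ++l) sum += contrib.v[l];
    return sum;
}

// Direct-sum potential at every particle, processing sources W at a time.
template <typename T, std::size_t W = 4>
std::vector<T> coulomb_potential(const std::vector<T>& x,
                                 const std::vector<T>& y,
                                 const std::vector<T>& z,
                                 const std::vector<T>& q) {
    assert(x.size() == y.size() && y.size() == z.size() &&
           z.size() == q.size());
    const std::size_t n = x.size();
    std::vector<T> phi(n, T(0));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j0 = 0; j0 < n; j0 += W) {
            // Lanes past the end of the data keep zero charge,
            // so they contribute nothing to the sum.
            Pack<T, W> sx, sy, sz, sq;
            for (std::size_t l = 0; l < W && j0 + l < n; ++l) {
                sx.v[l] = x[j0 + l];
                sy.v[l] = y[j0 + l];
                sz.v[l] = z[j0 + l];
                sq.v[l] = q[j0 + l];
            }
            phi[i] += p2p_block<T, W>(x[i], y[i], z[i], sx, sy, sz, sq);
        }
    }
    return phi;
}
```

Calling `coulomb_potential<double, 8>(x, y, z, q)` instead of `coulomb_potential<double, 4>(x, y, z, q)` changes only the lane width. The kernel body stays the same, which is the kind of platform-independent description the abstract refers to. Whether a given compiler actually vectorizes these loops depends on the compiler and its flags.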


Bibliographic details

Source
International Workshop on Computational Engineering | 2015 | 18 pages
Authors
Andreas Beckmann; Ivo Kabadshow
Original format: PDF
Chinese Library Classification: Computing technology, computer technology

