Conference: IEEE International Parallel and Distributed Processing Symposium
Optimization of an Electromagnetics Code with Multicore Wavefront Diamond Blocking and Multi-dimensional Intra-Tile Parallelization

Abstract

Understanding and optimizing the properties of solar cells is becoming a key issue in the search for alternatives to nuclear and fossil energy sources. A theoretical analysis via numerical simulations involves solving Maxwell's Equations in discretized form and typically requires substantial computing effort. We start from a hybrid-parallel (MPI+OpenMP) production code that implements the Time Harmonic Inverse Iteration Method (THIIM) with Finite-Difference Frequency Domain (FDFD) discretization. Although this algorithm has the characteristics of a strongly bandwidth-bound stencil update scheme, it is significantly different from the popular stencil types that have been exhaustively studied in the high performance computing literature to date. We apply a recently developed stencil optimization technique, multicore wavefront diamond tiling with multi-dimensional cache block sharing, and describe in detail the peculiarities that need to be considered due to the special stencil structure. Concurrency in updating the components of the electric and magnetic fields provides an additional level of parallelism. The dependence of the cache size requirement of the optimized code on the blocking parameters is modeled accurately, and an auto-tuner searches for optimal configurations in the remaining parameter space. We were able to completely decouple the execution from the memory bandwidth bottleneck, accelerating the implementation by a factor of three to four compared to an optimal implementation with pure spatial blocking on an 18-core Intel Haswell CPU.