Conference: AIAA computational fluid dynamics conference; AIAA aviation forum

Optimizing performance of combustion chemistry solvers on Intel's Many Integrated Core (MIC) architectures

Abstract

This work investigates novel algorithm designs and optimization techniques for restructuring chemistry integrators in zero- and multidimensional combustion solvers so that they can be used effectively on the emerging generation of Intel's Many Integrated Core/Xeon Phi processors. These processors offer increased computing performance through a large number of lightweight cores running at lower clock speeds than the traditional processors (e.g., Intel Sandy Bridge/Ivy Bridge) used in current supercomputers. Despite the lower clock speeds, this style of processor can be used productively for chemistry integrators, which form a costly part of computational combustion codes. Performance commensurate with traditional processors is achieved here by combining careful memory layout, exposure of multiple levels of fine-grained parallelism, and extensive use of vendor-supported libraries (Cilk Plus and the Math Kernel Library). Important optimization techniques for efficient memory usage and vectorization have been identified and quantified. For large chemical mechanisms, these optimizations resulted in a speed-up of ~3 with the Intel 2013 compiler and ~1.5 with the Intel 2017 compiler, compared with the unoptimized version on the Intel Xeon Phi. These strategies, especially those concerning memory usage and vectorization, should also benefit general-purpose computational fluid dynamics codes.
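
The abstract names memory layout and vectorization as the key optimizations but gives no code. As a minimal sketch of the general idea, and not the authors' implementation, the C++ fragment below stores reaction parameters in a structure-of-arrays layout so that the rate-constant loop reads contiguous memory with unit stride, which is the access pattern compilers can usually vectorize. The modified Arrhenius form, the `ReactionsSoA` type, and the `rate_constants` function are all assumptions introduced for illustration; the paper itself relies on Cilk Plus and the Math Kernel Library, which are not used here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays layout: one contiguous array per
// Arrhenius parameter, instead of one struct per reaction.
struct ReactionsSoA {
    std::vector<double> A;     // pre-exponential factor
    std::vector<double> beta;  // temperature exponent
    std::vector<double> Ea_R;  // activation energy divided by the gas constant, in K
};

// Evaluates k_i = A_i * T^beta_i * exp(-Ea_i / (R T)) for every reaction.
// T^beta is folded into the exponential as exp(beta * log T), so the loop
// body is a single exp call over unit-stride data.
void rate_constants(const ReactionsSoA& rx, double T, std::vector<double>& k) {
    const std::size_t n = rx.A.size();
    assert(rx.beta.size() == n && rx.Ea_R.size() == n);
    k.resize(n);
    const double logT = std::log(T);  // hoisted: identical for all reactions
    const double invT = 1.0 / T;      // hoisted: avoids a division per reaction
    for (std::size_t i = 0; i < n; ++i) {
        k[i] = rx.A[i] * std::exp(rx.beta[i] * logT - rx.Ea_R[i] * invT);
    }
}
```

Whether a given compiler actually vectorizes this loop depends on its version and flags, so the effect would need to be checked with the compiler's vectorization report on the target hardware.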