Conference on Facing the Multicore-Challenge

Towards High-Performance Implementations of a Custom HPC Kernel Using Intel Array Building Blocks

Abstract

Today's highly parallel machines drive new demand for parallel programming. Fixed power envelopes, growing problem sizes, and new algorithms pose challenging targets for developers. To reach optimal performance, HPC applications must exploit SIMD units, multi-core architectures, and heterogeneous computing platforms, which tends to produce low-level, non-portable code that is difficult to write and maintain. With Intel Array Building Blocks (Intel ArBB), programmers focus on the high-level algorithm and rely on automatic parallelization and vectorization with strong safety guarantees. Intel ArBB hides vendor-specific hardware details behind just-in-time (JIT) compilation at runtime. This case study on data mining with adaptive sparse grids shows how deterministic parallelism, safety, and runtime optimization make Intel ArBB applicable in practice. Hand-tuned code is about 40% faster than the ArBB version but requires about 8x more code. ArBB outperforms standard semi-automatically parallelized C/C++ code by approximately 6x.
