Journal: Parallel Computing
A compiler for exploiting nested parallelism in OpenMP programs

Abstract

This paper presents the design and implementation of a parallelization framework and OpenMP runtime support in the Intel® C++ and Fortran compilers for exploiting nested parallelism in applications that use OpenMP pragmas or directives. We evaluate the performance of two multimedia applications parallelized with OpenMP pragmas and compiled with the Intel C++ compiler on multiprocessor systems with Hyper-Threading Technology (HT) enabled. The multithreaded code generated by the Intel compiler achieves a speedup of up to 4.69 on four processors with HT enabled across five input video sequences for the H.264 encoder workload, and speedups of 1.28 on an HT-enabled single-CPU system and 1.99 on an HT-enabled dual-CPU system for the audio-visual speech recognition workload. Exploiting nested parallelism to leverage Hyper-Threading Technology yields a performance gain of up to 70% for the two multimedia workloads under different multiprocessor system configurations. These results demonstrate that the benefits of Hyper-Threading can be realized by exploiting nested parallelism through Intel compiler and runtime system support for OpenMP programs.
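
For readers unfamiliar with the term, nested parallelism in OpenMP means opening a parallel region inside another parallel region, so that each outer thread forks its own inner team. The sketch below is only an illustration of that idea and is not taken from the paper: the function name `nested_sum`, the matrix-sum workload, and the team sizes of 2 are all assumptions chosen for brevity. It uses standard OpenMP pragmas and the standard `omp_set_max_active_levels` routine; the `_OPENMP` guard lets the same code compile and run serially when OpenMP is not enabled.

```c
#include <stddef.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical example (not from the paper): sum every element of a
 * rows x cols row-major matrix using two levels of parallelism. */
double nested_sum(const double *m, int rows, int cols)
{
    double total = 0.0;

#ifdef _OPENMP
    /* Permit two nested levels of active parallel regions. */
    omp_set_max_active_levels(2);
#endif

    /* Outer level: rows are divided among a small team (size 2 is an
     * assumption for illustration). */
    #pragma omp parallel for num_threads(2) reduction(+:total)
    for (int r = 0; r < rows; ++r) {
        double row_sum = 0.0;

        /* Inner level: each outer thread forks its own team to split
         * the columns of its current row. */
        #pragma omp parallel for num_threads(2) reduction(+:row_sum)
        for (int c = 0; c < cols; ++c) {
            row_sum += m[(size_t)r * (size_t)cols + (size_t)c];
        }

        total += row_sum;
    }
    return total;
}
```

With GCC or Clang the pragmas take effect when the file is compiled with `-fopenmp`; without that flag they are ignored and the loops run sequentially, producing the same sum. Whether two levels actually pay off depends on the workload and on how many hardware threads are available, which is the question the evaluation in the abstract addresses.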