
On the Limitations of Compilers to Exploit Thread-Level Parallelism in Embedded Applications


Abstract

With the growing acceptance of multi-core architectures by the industry, devising novel techniques to extract thread-level parallelism from sequential programs has become a fundamental need. The role of the compiler, along with the programming model and architectural innovation, is of utmost importance in fully realizing the potential performance benefits of multi-core architectures. This paper evaluates the capabilities and limitations of parallelizing compilers in automatically extracting parallelism from the loops present in sequential programs. Applications from the embedded benchmark suites EEMBC 1.1 and MiBench are analyzed using the Intel C++ 9.1 Compiler for Linux. The contributions of the paper are manifold. Firstly, the paper shows that, on average, 10% of the loops can be parallelized automatically by the Intel Compiler. Secondly, it shows that the auto-parallelizable loops cover only about 12.5% of the total program execution time. Thirdly, it explores the reasons behind the compiler's inability to auto-parallelize the majority of the loops: on average, 37.5% of the loops cannot be auto-parallelized because of a statically unknown loop trip count, and 8% because of probable data dependence. Finally, the study identifies the set of loops that accounts for most of the execution time of the programs and shows that the compiler can, on average, automatically parallelize about 22% of such loops.
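The two obstacles named in the abstract can be illustrated with a minimal C sketch. The functions below are hypothetical and do not come from the paper or from the EEMBC and MiBench sources; they only show the kind of loop a parallelizing compiler can typically handle, and the two loop shapes (statically unknown trip count, possible data dependence) that usually make it give up.

```c
#include <stddef.h>

/* Countable loop with independent iterations. The restrict qualifiers
 * promise the compiler that dst and src do not overlap, so every
 * iteration can run on a different thread. (Illustrative helper.) */
void scale(float *restrict dst, const float *restrict src,
           size_t n, float k)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = k * src[i];
}

/* Statically unknown trip count: the loop exits on a sentinel value
 * found in the data, so the number of iterations is not known before
 * the loop starts and the iteration space cannot be split up front. */
size_t count_until_zero(const int *p)
{
    size_t n = 0;
    while (p[n] != 0)
        n++;
    return n;
}

/* Probable data dependence: without restrict, dst and src may alias.
 * If they point to the same array, iteration i reads the value that
 * iteration i-1 wrote, so the compiler must assume a loop-carried
 * dependence and keep the loop sequential. */
void shift_add(int *dst, const int *src, size_t n)
{
    for (size_t i = 1; i < n; i++)
        dst[i] = src[i - 1] + 1;
}
```

Calling `shift_add(a, a, 5)` on a zero-initialized array yields 0, 1, 2, 3, 4, because each iteration depends on the previous one. With separate arrays the iterations are independent, but the compiler cannot prove that from the function signature alone.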

Bibliographic record

  • Source
    2007 | pp. 60-66 | 7 pages
  • Author

    Islam, M. Mafijul

  • Original format: PDF
