
A Compiler Framework for Extracting Superword Level Parallelism


Abstract

SIMD (single-instruction multiple-data) instruction set extensions are common today in both high-performance and embedded microprocessors, and they enable the exploitation of a specific type of data parallelism called superword level parallelism (SLP). While prior research shows that significant performance gains are possible when SLP is exploited, placing SIMD instructions in application code manually can be very difficult and error-prone. In this paper, we propose a novel automated compiler framework for improving the exploitation of SLP. The key part of our framework consists of two stages: superword statement generation and data layout optimization. The first stage is our main contribution and has two phases, statement grouping and statement scheduling, whose primary goals are to increase SIMD parallelism and, more importantly, to capture more superword reuse among the superword statements through global analysis of data access and reuse patterns. Further, as a complementary optimization, our data layout optimization organizes data in memory such that the cost of memory operations for SLP is minimized. The results from our compiler implementation and tests on two systems indicate performance improvements as high as 15.2% over a state-of-the-art SLP optimization algorithm.
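
To make the statement grouping phase more concrete, the following is a minimal sketch of the general idea of packing isomorphic, independent scalar statements into superword statements. It is not the paper's algorithm: the `Ref` and `Stmt` types, the greedy strategy, and the `group_statements` function are all hypothetical names introduced for illustration, and the sketch assumes that moving the grouped statements next to each other is legal with respect to the statements in between.

```python
from typing import NamedTuple


class Ref(NamedTuple):
    array: str  # array name (hypothetical representation)
    index: int  # constant element offset


class Stmt(NamedTuple):
    dest: Ref   # element written
    op: str     # binary operator, e.g. "+" or "*"
    lhs: Ref    # first element read
    rhs: Ref    # second element read


def isomorphic(a: Stmt, b: Stmt) -> bool:
    """Same operator applied to the same arrays, position by position."""
    return (a.op == b.op
            and a.dest.array == b.dest.array
            and a.lhs.array == b.lhs.array
            and a.rhs.array == b.rhs.array)


def follows(a: Stmt, b: Stmt) -> bool:
    """b touches the element right after a's in every operand position."""
    pairs = ((a.dest, b.dest), (a.lhs, b.lhs), (a.rhs, b.rhs))
    return all(y.index == x.index + 1 for x, y in pairs)


def independent(a: Stmt, b: Stmt) -> bool:
    """Neither statement reads or overwrites the element the other writes."""
    return (a.dest != b.dest
            and a.dest not in (b.lhs, b.rhs)
            and b.dest not in (a.lhs, a.rhs))


def group_statements(block: list[Stmt], width: int) -> list[list[int]]:
    """Greedily pack statement indices into groups of at most `width`.

    Assumption (not checked here): reordering the block so that each
    group's statements become adjacent does not violate dependences
    with the statements that originally sat between them.
    """
    packs: list[list[int]] = []
    used: set[int] = set()
    for i in range(len(block)):
        if i in used:
            continue
        pack = [i]
        used.add(i)
        for j in range(i + 1, len(block)):
            if len(pack) == width:
                break
            cand = block[j]
            last = block[pack[-1]]
            if (j not in used
                    and isomorphic(last, cand)
                    and follows(last, cand)
                    and all(independent(block[k], cand) for k in pack)):
                pack.append(j)
                used.add(j)
        packs.append(pack)
    return packs


# Interleaved scalar statements: a[k] = b[k] + c[k]; d[k] = b[k] * c[k]
block = []
for k in range(4):
    block.append(Stmt(Ref("a", k), "+", Ref("b", k), Ref("c", k)))
    block.append(Stmt(Ref("d", k), "*", Ref("b", k), Ref("c", k)))

packs = group_statements(block, width=4)
print(packs)  # [[0, 2, 4, 6], [1, 3, 5, 7]]
```

In this toy block, both resulting groups read the same four-element slices of `b` and `c`, which is the kind of superword reuse the abstract says the grouping and scheduling phases aim to capture. A real compiler would also have to account for alignment, dependences across the whole block, and the cost of packing and unpacking operands.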
