Optimizing compiler for shared-memory multiple SIMD architecture

Abstract

The rapid growth of multimedia and gaming applications puts increasing pressure on the processing capability of modern processors. Multiple-SIMD architectures are widely used as accelerators in multimedia processing. Because of power-consumption and chip-size considerations, shared-memory multiple-SIMD (SM-SIMD) architectures are mainly used in embedded SoCs. To further suit mobile environments, the number of registers is also limited. Although a shared-memory multiple-SIMD architecture simplifies chip design, these constraints are the major obstacles to mapping real multimedia applications onto such architectures. To the best of our knowledge, there has so far been little research on optimization techniques for shared-memory multiple-SIMD architectures. In this paper, we present a compiler framework that aims to automatically generate high-performance code for shared-memory multiple-SIMD architectures. The framework reduces contention on the shared data bus by increasing register locality, improves data-bus utilization through read-only data vector replication, and addresses the limited number of registers with a resource allocation algorithm. The framework also handles issues related to data transformation. Experimental results show that the framework successfully maps real multimedia applications onto shared-memory multiple-SIMD architectures, achieving an average speedup of 3.19x and an average utilization of 52.6% on an SM-SIMD architecture with 8 SIMD units.