Vectorizing for Wider Vector Units in a HW/SW Co-designed Environment

Abstract

SIMD accelerators provide an energy-efficient way of improving the computational power of modern microprocessors. Due to their hardware simplicity, these accelerators have grown in width from 64-bit vectors in Intel's MMX to 512-bit vector units in Intel's Xeon Phi. Although SIMD accelerators are simple in terms of hardware design, code generation for them has always been a challenge. This paper explores the scalability of SIMD accelerators from the code generation point of view. We explore the potential problems in vectorization at higher vector lengths. Furthermore, we propose Variable Length Vectorization and Selective Writing in a HW/SW co-designed environment to get around these problems. We evaluate our proposals using a set of SPECFP2006 and Physics bench applications. For a 512-bit vector length, our experimental results show an average dynamic instruction elimination of 33% and 40% and an average speedup of 15% and 10% for SPECFP2006 and Physics bench, respectively, over the scalar baseline code.
