Journal: Concurrency and Computation

An efficient memory operations optimization technique for vector loops on Itanium 2 processors


Abstract

To keep up with a high degree of instruction-level parallelism (ILP), the Itanium 2 cache systems use a complex organization scheme: load/store queues, banking, and interleaving. In this paper, we study the impact of these cache systems on memory instruction scheduling. We demonstrate that, if no care is taken at compile time, the non-precise memory disambiguation mechanism and the banking structure cause severe performance loss, even for very simple regular codes. We also show that grouping the memory operations in a pseudo-vectorized way enables the compiler to generate more effective code for the Itanium 2 processor. The impact of this code optimization technique on register pressure is analyzed for various vectorization schemes.
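The grouping idea in the abstract can be illustrated at source level. The sketch below is not taken from the paper: the function names `add_interleaved` and `add_grouped`, the element type, and the unroll factor of 4 are all assumptions chosen for illustration. The paper's technique applies at the instruction scheduling level, so an optimizing compiler is free to reorder this source code. The sketch only shows the intended shape, in which each unrolled iteration issues all of its loads from one array together, then the arithmetic, then all of its stores.

```c
#include <assert.h>
#include <stddef.h>

/* Baseline: each iteration alternates loads from b and c with a store to a. */
void add_interleaved(double *restrict a, const double *restrict b,
                     const double *restrict c, size_t n)
{
    assert(a && b && c);
    for (size_t i = 0; i < n; i++)
        a[i] = b[i] + c[i];
}

/* Pseudo-vectorized sketch (unroll factor 4 is an assumption):
 * memory operations on the same array are grouped together. */
void add_grouped(double *restrict a, const double *restrict b,
                 const double *restrict c, size_t n)
{
    assert(a && b && c);
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        /* Group 1: four consecutive loads from b. */
        double b0 = b[i], b1 = b[i + 1], b2 = b[i + 2], b3 = b[i + 3];
        /* Group 2: four consecutive loads from c. */
        double c0 = c[i], c1 = c[i + 1], c2 = c[i + 2], c3 = c[i + 3];
        /* Arithmetic on values already held in registers. */
        double s0 = b0 + c0, s1 = b1 + c1, s2 = b2 + c2, s3 = b3 + c3;
        /* Group 3: four consecutive stores to a. */
        a[i]     = s0;
        a[i + 1] = s1;
        a[i + 2] = s2;
        a[i + 3] = s3;
    }
    /* Remainder loop for the n % 4 trailing elements. */
    for (; i < n; i++)
        a[i] = b[i] + c[i];
}
```

In the grouped version, eight loaded values are live at once before any result is stored, whereas the baseline needs only two per iteration. This is the kind of register pressure cost the abstract says is analyzed for different vectorization schemes.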
