Journal: Operating systems review

An Evaluation of Memory Consistency Models for Shared-Memory Systems with ILP Processors



Abstract

Relaxed consistency models have been shown to significantly outperform sequential consistency for single-issue, statically scheduled processors with blocking reads. However, current microprocessors aggressively exploit instruction-level parallelism (ILP) using methods such as multiple issue, dynamic scheduling, and non-blocking reads. Researchers have conjectured that two techniques, hardware-controlled non-binding prefetching and speculative loads, have the potential to equalize the hardware performance of memory consistency models on such processors. This paper performs the first detailed quantitative comparison of several implementations of sequential consistency and release consistency optimized for aggressive ILP processors. Our results indicate that hardware prefetching and speculative loads dramatically improve the performance of sequential consistency. However, the gap between sequential consistency and release consistency depends on the cache write policy and the complexity of the cache-coherence protocol implementation. In most cases, release consistency significantly outperforms sequential consistency, but for two applications, the use of a write-back primary cache and a more complex cache-coherence protocol nearly equalizes the performance of the two models. We also observe that the existing techniques, which require on-chip hardware modifications, enhance the performance of release consistency only to a small extent. We propose two new software techniques - fuzzy acquires and selective acquires - to achieve more overlap than allowed by the previous implementations of release consistency. To enhance methods for overlapping acquires, we also propose a technique to eliminate control dependences caused by an acquire loop, using a small amount of off-chip hardware called the synchronization buffer.
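For readers unfamiliar with the terms, the difference between the two models, and what an "acquire loop" looks like, can be illustrated at the language level. The following is a minimal sketch in Rust, not the paper's hardware implementation: the function name `message_pass` and the value 42 are hypothetical, and Rust's `Release`/`Acquire` and `SeqCst` orderings are used only as an approximate software analogue of release consistency and sequential consistency.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;

/// Hypothetical message-passing example: a producer writes data and then
/// sets a flag; the consumer spins on the flag (the acquire loop) and then
/// reads the data. The orderings used for the flag are parameters.
fn message_pass(store_order: Ordering, load_order: Ordering) -> u64 {
    let data = Arc::new(AtomicU64::new(0));
    let flag = Arc::new(AtomicBool::new(false));

    let producer = {
        let (data, flag) = (Arc::clone(&data), Arc::clone(&flag));
        thread::spawn(move || {
            // Ordinary data write; it must become visible before the flag.
            data.store(42, Ordering::Relaxed);
            // Release (or sequentially consistent) write to the flag.
            flag.store(true, store_order);
        })
    };

    // Acquire loop: the branch on the loaded flag is the kind of control
    // dependence that limits overlap with the accesses that follow it.
    while !flag.load(load_order) {
        std::hint::spin_loop();
    }
    // Ordered after the acquire, so the producer's write is visible here.
    let seen = data.load(Ordering::Relaxed);

    producer.join().unwrap();
    seen
}

fn main() {
    // Release-consistency-style synchronization: only the flag is ordered.
    let rc = message_pass(Ordering::Release, Ordering::Acquire);
    // Sequential-consistency-style synchronization on the flag.
    let sc = message_pass(Ordering::SeqCst, Ordering::SeqCst);
    println!("release/acquire saw {rc}, seq-cst saw {sc}");
}
```

Both variants return the producer's value. Under the release/acquire pairing only the flag accesses carry ordering constraints, which is the extra freedom to overlap memory operations that the abstract attributes to release consistency.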
