Journal of Parallel and Distributed Computing

Synchronization coherence: A transparent hardware mechanism for cache coherence and fine-grained synchronization

Abstract

The quest to improve performance forces designers to explore finer-grained multiprocessor machines. Ever-increasing chip densities based on CMOS improvements fuel research in highly parallel chip multiprocessors with hundreds of processing elements. With such increasing levels of parallelism, synchronization is set to become a major performance bottleneck, and efficient support for synchronization an important design criterion. Previous research has shown that integrating support for fine-grained synchronization can yield significant performance benefits compared to traditional coarse-grained synchronization. Little progress has been made, however, in supporting fine-grained synchronization transparently to processor nodes, which is perhaps a key reason why wide adoption has not followed. In this paper, we propose a novel approach called synchronization coherence that can provide transparent fine-grained synchronization and caching in multiprocessor machines and single-chip multiprocessors. Our approach merges fine-grained synchronization mechanisms with traditional cache coherence protocols. It reduces network utilization as well as synchronization-related processing overheads while adding minimal hardware complexity compared to cache coherence mechanisms or previously reported fine-grained synchronization techniques. In addition to making synchronization transparent to processor nodes, for the applications studied it provides up to 23% improvement in performance and up to 24% improvement in energy efficiency with no L2 caches, compared to previous fine-grained synchronization techniques. The performance improvement rises to as much as 38% when simulating with an ideal L2 cache system.
