Conference: 2011 25th IEEE International Parallel & Distributed Processing Symposium

GLocks: Efficient Support for Highly-Contended Locks in Many-Core CMPs



Abstract

Synchronization is of paramount importance for exploiting thread-level parallelism on many-core CMPs. In these architectures, synchronization mechanisms usually rely on shared variables to coordinate multithreaded access to shared data structures, thus avoiding data dependency conflicts. Lock synchronization is known to be a key limitation to performance and scalability. On the one hand, lock acquisition through busy waiting on shared variables generates additional coherence activity which interferes with applications. On the other hand, lock contention causes serialization, which results in performance degradation. This paper proposes and evaluates GLocks, a hardware-supported implementation for highly-contended locks in the context of many-core CMPs. GLocks use a token-based message-passing protocol over a dedicated network built on state-of-the-art technology. This approach skips the memory hierarchy to provide a non-intrusive, extremely efficient and fair lock implementation with negligible impact on energy consumption or die area. A comprehensive comparison against the most efficient shared-memory-based lock implementation for a set of microbenchmarks and real applications quantifies the benefits of GLocks. Performance results show average reductions of 42% and 14% in execution time, 76% and 23% in network traffic, and 78% and 28% in the energy-delay² product (ED²P) metric for the full CMP, for the microbenchmarks and the real applications, respectively. In light of these performance results, we can conclude that GLocks satisfy our initial working hypothesis. GLocks minimize the cache-coherence network traffic due to lock synchronization, which translates into reduced power consumption and execution time.
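
The abstract does not spell out the token protocol, so the following is only a rough illustration of why token passing can give fair lock hand-off without spinning on a shared variable. It is a minimal software sketch, assuming a single token that visits the cores in round-robin order; the function name `token_ring_grants`, the ring order, and the request format are hypothetical and are not taken from the paper, which implements the mechanism in hardware over a dedicated network.

```python
def token_ring_grants(num_cores, requests):
    """Return the order in which cores are granted the lock.

    Illustrative assumption (not the GLocks hardware protocol): one token
    circulates over the cores 0..num_cores-1 in round-robin order, and a
    core may enter its critical section only while it holds the token.

    requests maps a core id to the number of acquisitions it wants.
    """
    if any(core < 0 or core >= num_cores for core in requests):
        raise ValueError("request from a core id outside the ring")

    pending = dict(requests)   # remaining acquisitions per core
    grant_order = []
    holder = 0                 # core currently holding the token

    while any(count > 0 for count in pending.values()):
        if pending.get(holder, 0) > 0:
            # Holding the token is holding the lock; the critical
            # section of this core would execute here.
            grant_order.append(holder)
            pending[holder] -= 1
        # Pass the token on, whether or not it was used.
        holder = (holder + 1) % num_cores

    return grant_order
```

For example, `token_ring_grants(4, {0: 2, 2: 1, 3: 2})` returns `[0, 2, 3, 0, 3]`: core 0 cannot take its second turn until the token has passed the other requesters, which is the fairness property in this toy model.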
