Timestamp snooping

Abstract

Symmetric multiprocessor (SMP) servers provide superior performance for the commercial workloads that dominate the Internet. Our simulation results show that over one-third of cache misses by these applications result in cache-to-cache transfers, where the data is found in another processor's cache rather than in memory. SMPs are optimized for this case by using snooping protocols that broadcast address transactions to all processors. Conversely, directory-based shared-memory systems must indirectly locate the owner and sharers through a directory, resulting in larger average miss latencies.

This paper proposes timestamp snooping, a technique that allows SMPs to i) utilize high-speed switched interconnection networks and ii) exploit physical locality by delivering address transactions to processors and memories without regard to order. Traditional snooping requires physical ordering of transactions. Timestamp snooping works by processing address transactions in a logical order. Logical time is maintained by adding a few bits per address transaction and having network switches perform a handshake to ensure on-time delivery. Processors and memories then reorder transactions based on their timestamps to establish a total order.

We evaluate timestamp snooping with commercial workloads on a 16-processor SPARC system using the Simics full-system simulator. We simulate both an indirect (butterfly) and a direct (torus) network design. For OLTP, DSS, web serving, web searching, and one scientific application, timestamp snooping with the butterfly network runs 6-28% faster than directories, at a cost of 13-43% more link traffic. Similarly, with the torus network, timestamp snooping runs 6-29% faster for 17-37% more link traffic. Thus, timestamp snooping is worth considering when buying more interconnect bandwidth is easier than reducing interconnect latency.
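
The reordering step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the class names, the `guaranteed_time` parameter (standing in for whatever the switch handshake tells an endpoint about on-time delivery), the use of the source processor ID as a tie-breaker, and the `GETS`/`GETX` labels are all assumptions made for illustration. The sketch only shows that endpoints receiving the same transactions in different physical orders can still process them in one shared logical order.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class AddressTransaction:
    timestamp: int                        # logical time carried with the transaction
    source: int                           # assumed tie-breaker: issuing processor ID
    address: int = field(compare=False)   # not used for ordering
    kind: str = field(compare=False)      # hypothetical request label


class ReorderQueue:
    """Buffers transactions that arrive in arbitrary physical order."""

    def __init__(self):
        self._heap = []

    def receive(self, txn):
        # Arrival order does not matter; the heap keeps the smallest
        # (timestamp, source) pair at the front.
        heapq.heappush(self._heap, txn)

    def release(self, guaranteed_time):
        # Assumption: the network guarantees that no transaction with a
        # timestamp <= guaranteed_time can still arrive, so these are
        # safe to process in logical order.
        ready = []
        while self._heap and self._heap[0].timestamp <= guaranteed_time:
            ready.append(heapq.heappop(self._heap))
        return ready


def demo():
    txns = [
        AddressTransaction(3, 1, 0x40, "GETS"),
        AddressTransaction(2, 0, 0x80, "GETX"),
        AddressTransaction(3, 0, 0x40, "GETX"),
    ]
    a, b = ReorderQueue(), ReorderQueue()
    for t in txns:            # endpoint A sees one physical order
        a.receive(t)
    for t in reversed(txns):  # endpoint B sees the opposite order
        b.receive(t)
    return a.release(3), b.release(3)


if __name__ == "__main__":
    order_a, order_b = demo()
    print([(t.timestamp, t.source) for t in order_a])
    print(order_a == order_b)
```

A priority queue keyed on the timestamp is one simple way to model the idea; real hardware would more likely use a small fixed-size buffer, since the abstract says the timestamp is only a few bits wide.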
