
LiteTM: Reducing Transactional State Overhead


Abstract

Transactional memory (TM) has been proposed to address some of the programmability issues of chip multiprocessors. Hardware implementations of transactional memory (HTMs) have made significant progress in supporting features such as long transactions that spill out of the cache, as well as context switches and page and thread migration in the middle of transactions. While essential for the adoption of HTMs in real products, supporting these features has resulted in significant state overhead. For instance, TokenTM adds at least 16 bits per block in the caches, which is significant in absolute terms, and steals 16 of 64 (25%) memory ECC bits per block, weakening error protection. The state bits also nearly double the tag array size. These significant and practical concerns may impede the adoption of HTMs, squandering the progress they have achieved. The overhead comes from tracking the thread identifier and the transactional read-sharer count at L1-block granularity. The thread identifier is used to identify the transaction, if there is only one, to which an L1-evicted block belongs. The read-sharer count is used to identify conflicts involving multiple readers (i.e., a write to a block with a non-zero count). To reduce this overhead, we observe that the thread identifiers and read-sharer counts are not needed in the majority of cases: (1) repeated misses to the same blocks are rare within a transaction (i.e., locality holds), and (2) transactional read-shared blocks that are both evicted from multiple sharers' L1s and involved in conflicts are rare. Exploiting these observations, we propose a novel HTM, called LiteTM, which completely eliminates the count and identifier and uses software to infer the lost information. Using simulations of the STAMP benchmarks running on 8 cores, we show that LiteTM reduces TokenTM's state overhead by about 87% while performing within 4% of TokenTM on average and within 10% in the worst case.
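
As a rough illustration of the per-block state the abstract describes (a thread identifier plus a read-sharer count, with a write to a block whose count is non-zero treated as a conflict), the following minimal Python sketch may help. It is not the TokenTM or LiteTM design: the names `BlockMeta`, `record_read`, `write_conflicts`, and `ecc_fraction` are hypothetical, and the rule that a sole reader may write its own block without conflict is an assumption made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockMeta:
    """Hypothetical per-block transactional state (illustrative layout only)."""
    owner_tid: Optional[int] = None  # the reading transaction, if there is only one
    read_sharers: int = 0            # number of transactional readers of the block


def record_read(meta: BlockMeta, tid: int) -> None:
    """Register a transactional read; assumes each thread reads the block once."""
    meta.read_sharers += 1
    # The identifier is only meaningful while exactly one transaction holds the block.
    meta.owner_tid = tid if meta.read_sharers == 1 else None


def write_conflicts(meta: BlockMeta, writer_tid: int) -> bool:
    """A write conflicts when the read-sharer count is non-zero.

    Assumption for this sketch: a sole reader writing its own block is not a conflict.
    """
    if meta.read_sharers == 0:
        return False
    return not (meta.read_sharers == 1 and meta.owner_tid == writer_tid)


def ecc_fraction(stolen_bits: int, total_bits: int) -> float:
    """Fraction of ECC bits repurposed for transactional state."""
    return stolen_bits / total_bits
```

With the figures quoted in the abstract, `ecc_fraction(16, 64)` gives 0.25, matching the stated 25% of memory ECC bits per block.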
