Journal: Computer Architecture News

Row-Buffer Decoupling: A Case for Low-Latency DRAM Microarchitecture

Abstract

Modern DRAM devices for main memory are organized into multiple banks to satisfy ever-increasing throughput, energy-efficiency, and capacity demands. Due to tight cost constraints, only one row per bank can be buffered (opened) and actively service requests at a time, and that row must be deactivated (closed) before a new row is loaded into the row buffer. Deactivating a row too early forces it to be re-opened for requests that would otherwise have been row-buffer hits, while deactivating it too late puts the deactivation process on the critical path of accessing data for row-buffer misses. The time to (de)activate a row is comparable to the time to read an open row, and applications are often sensitive to DRAM latency. Hence, it is critical to make the right decision on when to close a row. However, the increasing number of banks per DRAM device over generations reduces the number of requests per bank. With little information on future requests, a memory controller is forced to frequently predict when to close a row, while the dynamic nature of memory access patterns limits the prediction accuracy. In this paper, we propose a novel DRAM microarchitecture that can eliminate the need for any prediction. First, we identify that precharging the bitlines dominates the deactivation time, and that the sense amplifiers that work as a row buffer are physically coupled with the bitlines, such that a single command precharges both the bitlines and the sense amplifiers simultaneously. By decoupling the bitlines from the row buffers using isolation transistors, the bitlines can be precharged right after a row becomes activated. Therefore, only the sense amplifiers need to be precharged for a miss in most cases, which takes an order of magnitude less time than the conventional deactivation process. Second, we show that this row-buffer decoupling enables internal DRAM μ-operations to be separated and recombined, which memory controllers can exploit to make the main memory system more energy efficient.
Our experiments demonstrate that row-buffer decoupling improves the geometric mean of the instructions per cycle and MIPS²/W by 14% and 29%, respectively, for memory-intensive SPEC CPU2006 applications.
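
The latency argument in the abstract can be illustrated with a toy model. The sketch below is not the paper's evaluation methodology. The timing values (`t_cas`, `t_rcd`, `t_rp`, `t_sa_pre`) are hypothetical placeholders, with `t_sa_pre` set to one tenth of `t_rp` only to mirror the "order of magnitude shorter" claim, and all function and field names are assumptions introduced for illustration. It also assumes that every miss in the decoupled design needs only the sense-amplifier precharge, whereas the abstract says this holds in most cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    """Hypothetical DRAM timings in nanoseconds (illustrative values only)."""
    t_cas: float = 14.0    # read an already-open row (column access)
    t_rcd: float = 14.0    # activate a row
    t_rp: float = 14.0     # conventional precharge (bitlines + sense amplifiers)
    t_sa_pre: float = 1.4  # assumed sense-amplifier-only precharge, ~t_rp / 10


def open_page_latency(hit_rate: float, t: Timing) -> float:
    """Rows stay open: hits pay t_cas, misses pay precharge + activate + read."""
    miss = t.t_rp + t.t_rcd + t.t_cas
    return hit_rate * t.t_cas + (1.0 - hit_rate) * miss


def close_page_latency(t: Timing) -> float:
    """Rows are closed eagerly: every access pays activate + read."""
    return t.t_rcd + t.t_cas


def decoupled_latency(hit_rate: float, t: Timing) -> float:
    """Bitlines are precharged early, so a miss pays only t_sa_pre up front."""
    miss = t.t_sa_pre + t.t_rcd + t.t_cas
    return hit_rate * t.t_cas + (1.0 - hit_rate) * miss


if __name__ == "__main__":
    timing = Timing()
    for hit_rate in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(
            f"hit_rate={hit_rate:.2f}",
            f"open={open_page_latency(hit_rate, timing):.1f}",
            f"close={close_page_latency(timing):.1f}",
            f"decoupled={decoupled_latency(hit_rate, timing):.1f}",
        )
```

Under these assumed numbers, the decoupled model keeps the hit latency of an open-page policy while its miss penalty approaches that of a close-page policy, which is why no close-timing prediction is needed in this simplified picture.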
