Row-Buffer Decoupling: A Case for Low-Latency DRAM Microarchitecture

Seongil O; Young Hoon Son; Nam Sung Kim; Jung Ho Ahn


Abstract

Modern DRAM devices for the main memory are structured to have multiple banks to satisfy ever-increasing throughput, energy-efficiency, and capacity demands. Due to tight cost constraints, only one row can be buffered (opened) per bank and actively service requests at a time, while the row must be deactivated (closed) before a new row is stored into the row buffers. Hasty deactivation unnecessarily re-opens rows for otherwise row-buffer hits while hindsight accompanies the deactivation process on the critical path of accessing data for row-buffer misses. The time to (de)activate a row is comparable to the time to read an open row while applications are often sensitive to DRAM latency. Hence, it is critical to make the right decision on when to close a row. However, the increasing number of banks per DRAM device over generations reduces the number of requests per bank. This forces a memory controller to frequently predict when to close a row due to a lack of information on future requests, while the dynamic nature of memory access patterns limits the prediction accuracy. In this paper, we propose a novel DRAM microarchitecture that can eliminate the need for any prediction. First, we identify that precharging the bitlines dominates the deactivate time, while sense amplifiers that work as a row buffer are physically coupled with the bitlines such that a single command precharges both bitlines and sense amplifiers simultaneously. By decoupling the bitlines from the row buffers using isolation transistors, the bitlines can be precharged right after a row becomes activated. Therefore, only the sense amplifiers need to be precharged for a miss in most cases, taking an order of magnitude shorter time than the conventional deactivation process. Second, we show that this row-buffer decoupling enables internal DRAM μ-operations to be separated and recombined, which can be exploited by memory controllers to make the main memory system more energy efficient. 
Our experiments demonstrate that row-buffer decoupling improves the geometric mean of the instructions per cycle and MIPS²/W by 14% and 29%, respectively, for memory-intensive SPEC CPU2006 applications.
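The latency argument in the abstract can be illustrated with a toy single-bank model. This is a minimal sketch, not the authors' simulator: the names `Timing` and `access_latencies` are hypothetical, the timing values are arbitrary units chosen only to reflect the abstract's claims (activation, precharge, and reading an open row are comparable, and precharging only the sense amplifiers is about ten times shorter), and it assumes the bitlines have always finished precharging in the background before the next miss arrives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Timing:
    """Illustrative timings in arbitrary units (assumed, not from the paper)."""
    activate: int = 10        # open a row into the row buffer
    read: int = 10            # read from an already open row
    precharge: int = 10       # conventional: bitlines + sense amplifiers
    sa_precharge: int = 1     # decoupled: sense amplifiers only (assumed 10x shorter)


def access_latencies(rows, timing, decoupled):
    """Return per-access latency for one bank that keeps its row open.

    With decoupling, the bitlines are assumed to be precharged right after
    activation, so a miss only pays for the sense-amplifier precharge.
    """
    open_row = None
    latencies = []
    for row in rows:
        if row == open_row:
            # Row-buffer hit: data is already in the sense amplifiers.
            latency = timing.read
        elif open_row is None:
            # Bank is idle: no row to close first.
            latency = timing.activate + timing.read
        else:
            # Row-buffer miss: close the old row, then open the new one.
            close = timing.sa_precharge if decoupled else timing.precharge
            latency = close + timing.activate + timing.read
        open_row = row
        latencies.append(latency)
    return latencies


if __name__ == "__main__":
    trace = [3, 3, 7, 3]  # hypothetical row addresses hitting one bank
    t = Timing()
    print("conventional:", access_latencies(trace, t, decoupled=False))
    print("decoupled:   ", access_latencies(trace, t, decoupled=True))
```

In this sketch the hit latency is identical in both designs; only the miss path shortens, which is why the row can stay open without the controller having to predict when to close it.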


Bibliographic details

Source: Computer Architecture News, 2014, No. 3, pp. 337-348 (12 pages)
Authors: Seongil O; Young Hoon Son; Nam Sung Kim; Jung Ho Ahn
Affiliations: Seoul National University; Seoul National University; University of Wisconsin-Madison; Seoul National University
Original format: PDF
Language: English

