Many security-aware mobile devices use the secure hash algorithm (SHA) or the advanced encryption standard (AES) for data encryption, and therefore require nonvolatile memory (NVM) with a short read-access time (tAC) and wide IO to support high read bandwidth and the SHA/AES shift/rotate functions. STT-MRAM is the major on-chip NVM for advanced process nodes [2]–[6]; however, its small tunnel magnetoresistance (TMR) ratio requires small-offset sense amplifiers (SAs) for robust reads, at the expense of large area overhead and read energy (ERD). As Fig. 13.4.1 shows, designing STT-MRAM macros for security-related applications poses three main challenges. (1) Using a large number of SAs for parallel wide-IO readout achieves a short tAC, but results in a high peak current (IPEAK) and a large area overhead. Using fewer SAs for sequential wide-IO readout reduces IPEAK and area overhead, but imposes a long tAC and a low read bandwidth (BWR). (2) MRAM macros with a high IPEAK degrade the supply (VDD) integrity of the chip, often leading to failures in noise-sensitive blocks on the same chip. (3) A conventional memory-logic-separated scheme imposes a long latency (2 cycles: wide-IO memory read + flip-flop (FF) shift/rotate) for NVM-based security logic operations. This paper presents a multibit current-mode SA (MB-CSA) that achieves a high BWR with a short tAC and a low IPEAK. Also presented is a near-memory computing (NMC) unit with 1-cycle access, to speed up computing for security applications. This work resulted in a 22nm 1Mb STT-MRAM macro with dual-mode operation: wide-IO memory and NMC. The proposed 1Mb macro demonstrates the largest data-out width (1024b) with a tAC of 2.75ns using a 0.85V supply. In memory mode, the macro outperforms all reported NVM macros in terms of BWR (42.67GB/s) and ERD (0.23pJ/b). This work also presents the first MRAM macro with NMC functionality, which achieves a 33.3% reduction in logic area and a latency of only 170ps, after NVM access, for 1b shift/rotate operations.
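As a rough consistency check on the reported figures, the sketch below relates the 1024b data-out width to the reported BWR and ERD. It assumes, beyond what the abstract states, that 1 GB means 1e9 bytes and that read bandwidth equals bits per read divided by the time per read; the function names are illustrative only.

```python
# Back-of-the-envelope check of the abstract's headline numbers.
# Assumptions (not stated in the abstract): 1 GB = 1e9 bytes, and
# read bandwidth = bits per read / time per read.

BITS_PER_READ = 1024      # data-out width reported in the abstract
BW_R_GB_PER_S = 42.67     # read bandwidth reported in the abstract
E_RD_PJ_PER_BIT = 0.23    # read energy reported in the abstract


def implied_read_period_ns(bits_per_read: int, bw_gb_per_s: float) -> float:
    """Time per read, in ns, needed to sustain the given bandwidth."""
    bytes_per_read = bits_per_read / 8
    # bytes / (1e9 bytes per second) is a time in units of 1e-9 s, i.e. ns.
    return bytes_per_read / bw_gb_per_s


def energy_per_read_pj(bits_per_read: int, pj_per_bit: float) -> float:
    """Total read energy, in pJ, for one full-width read."""
    return bits_per_read * pj_per_bit


if __name__ == "__main__":
    period = implied_read_period_ns(BITS_PER_READ, BW_R_GB_PER_S)
    energy = energy_per_read_pj(BITS_PER_READ, E_RD_PJ_PER_BIT)
    print(f"implied time per 1024b read: {period:.2f} ns")
    print(f"energy per 1024b read: {energy:.1f} pJ")
```

Under these assumptions, sustaining 42.67GB/s with 1024b reads implies roughly 3ns per read, and each full-width read costs roughly 236pJ.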