
Compiler-Decided Dynamic Memory Allocation for Scratch-Pad Based Embedded Systems



Abstract

This paper presents a highly predictable, low-overhead and yet dynamic memory allocation strategy for embedded systems with scratch-pad memory. A scratch-pad is a fast compiler-managed SRAM memory that replaces the hardware-managed cache. It is motivated by its better real-time guarantees versus caches and by its significantly lower overheads in energy consumption, area and overall runtime, even with a simple allocation scheme. Existing scratch-pad allocation methods are of two types. First, software-caching schemes emulate the workings of a hardware cache in software: instructions are inserted before each load/store to check the software-maintained cache tags. Such methods incur large overheads in runtime, code size, energy consumption and SRAM space for tags, and deliver poor real-time guarantees just like hardware caches. A second category of algorithms partitions variables at compile time into the two banks. For example, our previous work derives a provably optimal static allocation for global and stack variables and achieves a speedup over all earlier methods. However, a drawback of such static allocation schemes is that they do not account for dynamic program behavior. It is easy to see why a data allocation that never changes at runtime cannot achieve the full locality benefits of a cache. In this paper we present a dynamic allocation method for global and stack data that, for the first time, (i) accounts for changing program requirements at runtime, (ii) has no software-caching tags, (iii) requires no run-time checks, (iv) has extremely low overheads, and (v) yields 100% predictable memory access times. In this method, data that is about to be accessed frequently is copied into the SRAM using compiler-inserted code at fixed and infrequent points in the program; earlier data is evicted if necessary. When compared to a provably optimal static allocation, our results show runtime reductions ranging from 11% to 38%, averaging 31.2%, using no additional hardware support.
With hardware support for pseudo-DMA and full DMA, which is already provided in some commercial systems, the runtime reductions increase to 33.4% and 34.2% respectively.
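The core idea of the abstract — copying soon-to-be-hot data into scratch-pad SRAM via compiler-inserted code at fixed program points, evicting earlier occupants — can be sketched in C. This is a minimal illustration, not the paper's actual compiler output: the scratch-pad is simulated as an ordinary static buffer, and all names (`SPM_SIZE`, `spm`, `big_table`, `sum_table`) are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define SPM_SIZE 256                 /* simulated scratch-pad capacity (bytes) */
static unsigned char spm[SPM_SIZE];  /* stands in for the fast compiler-managed SRAM */

/* Hot global array that the compiler has decided to promote to the
 * scratch-pad for the duration of a frequently executed loop. */
static int big_table[64];

int sum_table(void)
{
    /* Compiler-inserted copy-in at a fixed, infrequent program point:
     * move the data that is about to be accessed frequently into the
     * scratch-pad, implicitly evicting whatever occupied that region. */
    assert(sizeof big_table <= SPM_SIZE);
    memcpy(spm, big_table, sizeof big_table);
    int *fast_table = (int *)spm;

    int sum = 0;
    for (int i = 0; i < 64; i++)     /* every access in the hot loop now hits SRAM */
        sum += fast_table[i];

    /* Compiler-inserted copy-back; needed only if the loop wrote the data. */
    memcpy(big_table, spm, sizeof big_table);
    return sum;
}
```

Because the copy-in and copy-back happen at points the compiler fixes statically, every access inside the loop has a known latency — which is what gives the method its 100% predictable memory access times, in contrast to tag checks inserted before each load/store.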

