Journal: IEEE Transactions on Very Large Scale Integration (VLSI) Systems
Overlay techniques for scratchpad memories in low power embedded processors

Abstract

Energy consumption is one of the important parameters to be optimized during the design of portable embedded systems. Thus, most contemporary portable devices feature low-power processors coupled with on-chip memories (e.g., caches, scratchpads). Scratchpads are better than traditional caches in terms of power, performance, area, and predictability. However, unlike caches, they depend on software allocation techniques for their utilization. In this paper, we present scratchpad overlay techniques that analyze the application and insert instructions to dynamically copy both variables and code segments onto the scratchpad at runtime. We demonstrate that the problem of overlaying the scratchpad is an extension of the Global Register Allocation problem. We present optimal and near-optimal approaches for solving the scratchpad overlay problem. The near-optimal approach achieves results close to the optimal ones and is significantly faster than the optimal approach. Our approaches improve upon the previously known static allocation technique for assigning both variables and code segments onto the scratchpad. The evaluation of the approaches for an ARM7 processor reports average energy and execution time reductions of 26% and 14%, respectively, over the static approach. Additional experiments comparing the overlaid scratchpads against unified caches of the same size report average energy and execution time savings of 20% and 10%, respectively. We also report data memory energy reductions of 45%-57% due to the insertion of a 1024-byte scratchpad memory in the memory hierarchy of a digital signal processor (DSP).
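
To make the analogy with register allocation concrete, the following is a minimal sketch of an overlay decision, not the optimal or near-optimal approach from the paper. It assumes each memory object (a variable or code segment) has a known size, live range, and access count, and that the energy costs `e_main`, `e_spm`, and `e_copy` are hypothetical per-access and per-byte figures chosen only for illustration. Objects whose live ranges do not overlap may share scratchpad space, much as variables with disjoint live ranges may share a register. The sketch checks only capacity at each program point and ignores address assignment and fragmentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemObject:
    name: str
    size: int      # bytes (assumed > 0)
    start: int     # first program point where the object is live
    end: int       # last program point where the object is live (inclusive)
    accesses: int  # accesses made during the live range


def greedy_overlay(objects, capacity, e_main=4.0, e_spm=1.0, e_copy=5.0):
    """Greedily choose objects to copy onto the scratchpad.

    All energy figures are hypothetical units, not measured values.
    """
    if not objects:
        return []

    def gain(o):
        # Energy saved by serving accesses from the scratchpad,
        # minus the cost of copying the object in at runtime.
        return o.accesses * (e_main - e_spm) - o.size * e_copy

    horizon = max(o.end for o in objects) + 1
    used = [0] * horizon  # scratchpad bytes occupied at each program point
    chosen = []
    # Consider the most profitable objects per byte first.
    for o in sorted(objects, key=lambda o: gain(o) / o.size, reverse=True):
        if gain(o) <= 0:
            continue  # copying costs more than it saves
        span = range(o.start, o.end + 1)
        if all(used[t] + o.size <= capacity for t in span):
            for t in span:
                used[t] += o.size
            chosen.append(o.name)
    return chosen


objs = [
    MemObject("loop_a_code", 600, 0, 4, 5000),
    MemObject("table", 400, 0, 9, 3000),
    MemObject("loop_b_code", 600, 5, 9, 4000),
    MemObject("cold_buf", 256, 0, 9, 100),
]
print(greedy_overlay(objs, capacity=1024))
```

In this made-up example the three selected objects total 1600 bytes, yet they fit a 1024-byte scratchpad because `loop_a_code` and `loop_b_code` are never live at the same time. A static allocation, which keeps every selected object resident for the whole run, could not hold all three.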