Data placement optimizations for multilevel cache hierarchies.

Abstract

As compiler optimizations have increasingly focused on the memory hierarchy, a variety of efforts have attempted to reduce cache misses in first-level instruction and data caches. Placement of code to reduce instruction cache misses, and placement of data to reduce data cache misses, have been demonstrated to be beneficial for a variety of application programs. However, most of this work has been limited to reduction of first-level cache misses. Careful examination of various characteristics of modern computer architectures reveals opportunities for a data placement optimization framework that targets several means of performance improvement at once. Cache hierarchies have recently extended as deep as three levels, each with a different cache miss penalty. Cache misses need to be reduced at all cache levels to maximize performance. Reducing TLB (translation lookaside buffer) misses and virtual memory page use is also desirable. Addressing of global and local variables can use addressing modes of differing costs, and the less expensive addressing modes can be used more frequently if the data placement optimization considers this goal.

A multi-goal data placement framework has been developed to enable all of these optimizations. Through a novel method of static data affinity analysis, followed by a data placement optimization that uses hierarchical graph partitioning and local refinement, it is possible to achieve reductions in cache misses throughout the cache hierarchy, while also increasing page and TLB locality and enabling the address mode and bus cycle optimizations. An original method of characterizing the parameters of the cache and TLB hierarchy that are needed for the profiling and optimizations, using hardware performance counters, helps make the entire data placement framework practical and portable. The static data affinity analysis avoids the practical difficulties inherent in past research that relied on expensive dynamic profiling runs. The hierarchical graph partitioning approach to data placement is able to make use of Chaco, a well-tested, off-the-shelf graph partitioning code library. Extensive measurements using timings and cache simulations for Sun UltraSparc-II machines demonstrate the effectiveness of the data placement optimizations.
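
The idea of placing data according to affinity can be illustrated with a minimal sketch. It is not the thesis's method: it replaces hierarchical graph partitioning and local refinement with a simple greedy merge, models a single cache level only, and does not use Chaco. The function name `place`, the 64-byte line size, and the affinity weights are all hypothetical; in the framework described above the weights would come from static data affinity analysis.

```python
def place(sizes, affinity, line_size=64):
    """Greedy stand-in for affinity-driven data placement (illustrative only).

    sizes:    {variable name: size in bytes}
    affinity: {(name_a, name_b): weight}, higher means "used together more"
    Returns   {variable name: byte offset}, one cluster per line-aligned region.
    Per-variable alignment requirements are ignored in this sketch.
    """
    cluster_of = {name: name for name in sizes}   # variable -> cluster id
    members = {name: [name] for name in sizes}    # cluster id -> variables
    used = dict(sizes)                            # cluster id -> bytes used

    # Visit edges from heaviest to lightest; ties broken by name for determinism.
    edges = sorted(affinity.items(), key=lambda kv: (-kv[1], kv[0]))
    for (a, b), _weight in edges:
        ca, cb = cluster_of[a], cluster_of[b]
        # Merge only if the combined cluster still fits in one cache line.
        if ca == cb or used[ca] + used[cb] > line_size:
            continue
        for name in members[cb]:
            cluster_of[name] = ca
        members[ca].extend(members.pop(cb))
        used[ca] += used.pop(cb)

    # Lay clusters out back to back, each starting on a line boundary.
    offsets, base = {}, 0
    for group in members.values():
        cursor = base
        for name in group:
            offsets[name] = cursor
            cursor += sizes[name]
        base = -(-cursor // line_size) * line_size  # round up to next line
    return offsets


# Made-up example: four 32-byte variables, two strongly related pairs.
sizes = {"a": 32, "b": 32, "c": 32, "d": 32}
affinity = {("a", "c"): 10, ("b", "d"): 8, ("a", "b"): 1}
print(place(sizes, affinity))  # {'a': 0, 'c': 32, 'b': 64, 'd': 96}
```

In this toy input, `a` and `c` end up sharing one 64-byte line and `b` and `d` share the next, so each strongly related pair costs one line fill instead of two. A multilevel version would repeat the grouping at coarser granularities (for example, lines into pages), which is roughly the role the abstract assigns to hierarchical partitioning.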

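Bibliographic record

  • Author: Coleman, Clark L.
  • Author affiliation: University of Virginia
  • Degree-granting institution: University of Virginia
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2004
  • Pages: 172
  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Automation technology, computer technology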
