A tuning framework for software-managed memory hierarchies.

Abstract

New architectures are emerging at a rapid pace. Architectures with multiple processing units on a chip and with deep memory hierarchies have become pervasive, while architectures with software-managed memory hierarchies (such as the Sony/Toshiba/IBM Cell processor) have gained popularity. Because of this increased architectural complexity, re-targeting a legacy application to a new architecture requires substantial time for porting and tuning. To achieve both portability and high performance on modern machines, we propose a programming environment that includes a portable language (Sequoia), a portable runtime and a tuning framework. In this thesis, we focus on the design and implementation of the tuning framework.

Achieving good performance on a modern machine with a multi-level memory hierarchy, and in particular on a machine with software-managed memories, requires the meticulous tuning of programs to the machine's particular characteristics. Further, the choices made when tuning a program for one machine will typically be very different from those made when tuning the same program for a different machine. A large program on a multi-level machine can easily expose tens or hundreds of inter-dependent parameters that require tuning, ranging (for example) from subarray sizes to compiler flags to loop optimizations to decomposition strategies. Manually searching the resultant large, non-linear space of program parameters is a tedious process of trial and error. These challenges motivate the design of an automatic tuning framework.

In this dissertation, we present a general framework for automatically tuning arbitrary applications to machines with software-managed memory hierarchies. The tuning framework matches the decomposition strategies to the memory hierarchies. It uses a search algorithm, specialized to software-managed memory hierarchies, that achieves good performance quickly due to the smoothness of the search space. The framework also applies a novel fusion algorithm that considers multiple outermost loop levels in a single step. The knowledge learned when searching the tunable space is used to guide the selection of a fusion configuration.

We evaluate our framework by measuring the performance of benchmarks that are tuned for a range of machines with different memory hierarchy configurations: a cluster of Intel P4 Xeon processors, a single Cell processor and a cluster of Sony PlayStation 3s. The tuning framework gives similar or better performance than that achieved by the best-available hand-tuned version coded in Sequoia.
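
To make the idea of searching a smooth space of tunable parameters concrete, the following is a minimal sketch of a greedy coordinate search over a discrete parameter grid. It is not the search algorithm from the thesis, which the abstract does not specify. The function `coordinate_search`, the parameter names `tile` and `unroll`, the 256 KB capacity limit and the cost model `toy_cost` are all hypothetical, chosen only to show how a smooth cost surface lets a local search converge in a few steps.

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, int]


def coordinate_search(
    cost: Callable[[Params], float],
    space: Dict[str, List[int]],
    start: Params,
) -> Tuple[Params, float]:
    """Greedy coordinate search: move one parameter at a time to a
    neighbouring grid value whenever that lowers the cost."""
    # Track each parameter by its index into its list of candidate values.
    idx = {name: space[name].index(start[name]) for name in space}
    current = dict(start)
    best = cost(current)
    improved = True
    while improved:
        improved = False
        for name, values in space.items():
            for step in (-1, 1):
                j = idx[name] + step
                if not 0 <= j < len(values):
                    continue  # neighbour falls outside the grid
                trial = dict(current)
                trial[name] = values[j]
                c = cost(trial)
                if c < best:
                    best, current = c, trial
                    idx[name] = j
                    improved = True
    return current, best


# Hypothetical capacity of a software-managed local memory, in bytes.
LOCAL_STORE_BYTES = 256 * 1024


def toy_cost(p: Params) -> float:
    """Made-up cost model: larger tiles amortise transfer overhead, but
    three float32 tiles must fit in the local memory."""
    working_set = 3 * p["tile"] * p["tile"] * 4
    if working_set > LOCAL_STORE_BYTES:
        return float("inf")  # configuration does not fit
    return 1.0 / p["tile"] + 0.01 * abs(p["unroll"] - 4)


space = {"tile": [16, 32, 64, 128, 256], "unroll": [1, 2, 4, 8]}
best_params, best_cost = coordinate_search(
    toy_cost, space, {"tile": 16, "unroll": 1}
)
print(best_params, best_cost)
```

On this toy surface the search stops at the largest tile that still fits in the assumed local memory. A real tuner would replace `toy_cost` with measured run times, and a local search of this kind is only a reasonable choice when the space is smooth enough that small moves are informative, which is the property the abstract attributes to its search space.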

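Bibliographic record

  • Author: Ren, Manman
  • Author affiliation: Stanford University
  • Degree-granting institution: Stanford University
  • Subject: Engineering, Electronics and Electrical
  • Degree: Ph.D.
  • Year: 2009
  • Pages: 113
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Radio electronics, telecommunications technology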