Software support for ordering memory operations in parallel systems.



Abstract

Parallel processing is essential to exploiting the potential of multi-core processors. Correct and efficient programming for parallel machines is a notoriously difficult job, done well by only a few select, well-trained programmers. However, parallel platforms are becoming ubiquitous, requiring far more programs to be written by regular programmers. This motivates the implementation of new parallel programming paradigms that are efficient and easy to reason about and use.

Modern processors implement relaxed memory models when used as part of a shared memory system, that is, models in which loads and stores that do not reference the same memory location are allowed to execute in a different order than they appear in the program. Programming languages implement memory (or consistency) models that require additional memory references to be executed in order, beyond those that the relaxed-consistency processor already guarantees to execute in order; that is, they have a stricter memory model. An extreme example of a stricter memory model is the sequentially consistent memory model. Many consider a stricter model easier to reason about than a relaxed one.

Current processors provide fence instructions that allow these stricter orders to be enforced. We present a flow-based fence insertion algorithm for effectively enforcing the required orders. This algorithm is implemented in the Pensieve-Jikes compiler. Data showing the effectiveness of the algorithm are provided.

New architectures have been proposed that aim to support high-performance sequential consistency by committing groups of instructions (chunks) at one time. Aggressive compiler support is required to break programs into reasonably sized groups at strategic places, attaining a high-performance sequentially consistent environment. In the second half of this dissertation we present in detail a compiler algorithm and implementation that performs full-program automatic formation of chunks for such a blocked architecture. We show, for the first time, that fully automatic techniques with no programmer intervention provide a sequentially consistent system that has higher performance than conventional machines with relaxed memory models. For eight full Java programs, we show that compiler-generated code running on a simulated 4-processor blocked architecture that supports sequential consistency runs on average 5% faster than code on a conventional architecture supporting the more relaxed Java memory model.
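The reordering that fences are meant to prevent can be seen in the classic store-buffering pattern. The sketch below is only an illustration and is not the dissertation's fence insertion algorithm: the fences are placed by hand, the class and method names (`StoreBufferingDemo`, `countBothZero`) are hypothetical, and it assumes Java 9 or later for `VarHandle.fullFence()`. Each thread stores to one variable and then loads the other. Under sequential consistency at least one thread must observe the other's store, so the outcome `r1 == 0 && r2 == 0` is impossible; on a relaxed machine it may appear when the fences are omitted.

```java
import java.lang.invoke.VarHandle;

// Illustrative store-buffering litmus test; all names here are hypothetical.
final class StoreBufferingDemo {
    // Plain (non-volatile) shared fields: without a fence, each thread's
    // store and subsequent load may be reordered by the JIT or the hardware.
    static int x, y;
    static int r1, r2;

    // Runs the litmus test `trials` times and counts how often both threads
    // read 0, the outcome that sequential consistency forbids.
    static int countBothZero(int trials, boolean fenced) {
        int bothZero = 0;
        for (int i = 0; i < trials; i++) {
            x = 0; y = 0; r1 = -1; r2 = -1;
            Thread t1 = new Thread(() -> {
                x = 1;
                if (fenced) VarHandle.fullFence(); // keep the store before the load
                r1 = y;
            });
            Thread t2 = new Thread(() -> {
                y = 1;
                if (fenced) VarHandle.fullFence(); // keep the store before the load
                r2 = x;
            });
            t1.start();
            t2.start();
            try {
                t1.join();
                t2.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted during litmus test", e);
            }
            if (r1 == 0 && r2 == 0) bothZero++;
        }
        return bothZero;
    }

    public static void main(String[] args) {
        // With both fences in place the forbidden outcome should never occur.
        System.out.println("fenced, both zero:   " + countBothZero(2000, true));
        // Without fences the count may or may not be nonzero, depending on
        // the hardware, the JIT compiler, and thread timing.
        System.out.println("unfenced, both zero: " + countBothZero(2000, false));
    }
}
```

A full fence after every store is the simplest way to rule out the forbidden outcome, but it is also the most expensive; reducing the number of fences needed to enforce the required orders is the kind of problem the abstract's fence insertion algorithm addresses.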

Bibliographic record

  • Author: Fang, Xing
  • Author affiliation: Purdue University
  • Degree-granting institution: Purdue University
  • Subject: Engineering, Computer
  • Degree: Ph.D.
  • Year: 2012
  • Pages: 115
  • Original format: PDF
  • Language: English
