Software support for ordering memory operations in parallel systems.

Abstract

Parallel processing is essential to exploiting the potential of multi-core processors. Writing correct and efficient programs for parallel machines is notoriously difficult, and only a small number of well-trained programmers do it well. Parallel platforms are nevertheless becoming ubiquitous, so far more parallel programs must be written by ordinary programmers. This motivates new parallel programming paradigms that are efficient and easy to reason about and use.

Modern processors implement relaxed memory models when used as part of a shared-memory system, that is, models in which loads and stores that do not reference the same memory location may execute in an order different from the one in which they appear in the program. Programming languages implement memory (or consistency) models that are stricter: they require additional memory references to execute in order, beyond those the relaxed-consistency processor already guarantees. An extreme example of a stricter memory model is sequential consistency. Many consider a stricter model easier to reason about than a relaxed one.

Current processors provide fence instructions that allow these stricter orders to be enforced. We present a flow-based fence insertion algorithm for enforcing the required orders efficiently. The algorithm is implemented in the Pensieve-Jikes compiler, and we provide data showing its effectiveness.

New architectures have been proposed that aim to support high-performance sequential consistency by committing groups of instructions (chunks) at one time. Aggressive compiler support is required to break programs into reasonably sized chunks at strategic places and so attain a high-performance sequentially consistent environment. In the second half of this dissertation we present in detail a compiler algorithm and implementation that performs fully automatic, whole-program chunk formation for such a blocked architecture.

We show, for the first time, that fully automatic techniques with no programmer intervention can provide a sequentially consistent system that outperforms conventional machines with relaxed memory models. For eight complete Java programs, compiler-generated code running on a simulated 4-processor blocked architecture that supports sequential consistency runs on average 5% faster than code on a conventional architecture supporting the more relaxed Java memory model.
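The reordering that the abstract describes can be illustrated with the classic store-buffering test: thread 0 stores to x and then loads y, while thread 1 stores to y and then loads x. The sketch below is not the dissertation's fence insertion algorithm. It is a small, deterministic toy model (the class name `StoreBufferingSketch` and its methods are hypothetical) that enumerates every interleaving, first with each thread kept in program order, as sequential consistency or a fence between the store and the load would require, and then with each thread's store and load allowed to swap, as a relaxed model permits for accesses to different locations.

```java
import java.util.Set;
import java.util.TreeSet;

public class StoreBufferingSketch {
    // Toy model, for illustration only.
    // Thread 0: x = 1; r0 = y.   Thread 1: y = 1; r1 = x.
    // mem[0] is x, mem[1] is y; thread t stores to mem[t] and loads mem[1 - t].

    /** Collects every reachable "(r0,r1)" outcome. allowReorder models a relaxed machine with no fences. */
    static Set<String> outcomes(boolean allowReorder) {
        Set<String> seen = new TreeSet<>();
        boolean[] choices = allowReorder ? new boolean[] {false, true} : new boolean[] {false};
        for (boolean swap0 : choices) {
            for (boolean swap1 : choices) {
                // Bit i of mask selects which thread takes step i; each thread takes exactly two steps.
                for (int mask = 0; mask < 16; mask++) {
                    if (Integer.bitCount(mask) != 2) continue;
                    seen.add(run(mask, swap0, swap1));
                }
            }
        }
        return seen;
    }

    /** Executes one interleaving; swapN == true means thread N performs its load before its store. */
    static String run(int mask, boolean swap0, boolean swap1) {
        int[] mem = new int[2];
        int[] reg = new int[2];
        int[] pc = new int[2];
        boolean[] swap = {swap0, swap1};
        for (int step = 0; step < 4; step++) {
            int t = (mask >> step) & 1;
            // In program order the store is the first operation; when swapped it is the second.
            boolean isStore = (pc[t] == 0) != swap[t];
            if (isStore) {
                mem[t] = 1;
            } else {
                reg[t] = mem[1 - t];
            }
            pc[t]++;
        }
        return "(" + reg[0] + "," + reg[1] + ")";
    }

    public static void main(String[] args) {
        System.out.println("program order kept (SC or fenced): " + outcomes(false));
        System.out.println("store/load reordering allowed:     " + outcomes(true));
    }
}
```

In this model the outcome (0,0) is unreachable while program order is preserved and becomes reachable once the store and load may swap, which is the kind of order a fence between the two operations would restore.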

Bibliographic details

Author: Fang, Xing
Institution: Purdue University
Degree-granting institution: Purdue University
Subject: Engineering, Computer
Degree: Ph.D.
Year: 2012
Pages: 115
Format: PDF
Language: English
