Enabling PGAS Productivity with Hardware Support for Shared Address Mapping: A UPC Case Study


Abstract

The Partitioned Global Address Space (PGAS) programming model strikes a balance between the locality-aware but explicit message-passing model (e.g., MPI) and the easy-to-use but locality-agnostic shared memory model (e.g., OpenMP). However, the rich PGAS memory model comes at a performance cost that can hinder its potential for scalability and performance. Compiler optimizations alone may not be sufficient to contain this overhead and achieve full performance, so manual optimizations are typically added, which can severely limit the productivity advantage. Such optimizations usually target the address translation overhead of shared data structures. This paper proposes hardware architectural support for PGAS that allows the processor to handle shared addresses efficiently. This eliminates the need for such hand-tuning while maintaining the performance and productivity of PGAS languages. We propose to make this hardware support available to compilers by introducing new instructions that efficiently access and traverse the PGAS memory space. A prototype compiler is realized by extending the Berkeley Unified Parallel C (UPC) compiler. It allows unmodified code to use the new instructions without user intervention, thereby creating a truly productive programming environment. Two implementations of the system are realized. The first uses the full-system simulator Gem5, which allows the performance gain to be evaluated. The second uses the Leon3 soft-core processor on an FPGA to verify implementability and to quantify the cost of the new hardware and its instructions. The new instructions show promising results for the NAS Parallel Benchmarks implemented in UPC. A speedup of up to 5.5x is demonstrated for unmodified code. Unmodified code running on this hardware was also shown to surpass the performance of manually optimized code by up to 10%.
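To make the address translation overhead mentioned in the abstract concrete, the following is a minimal C sketch of the arithmetic a software runtime might perform when it traverses a block-cyclic shared array. It is an illustration under stated assumptions, not the instructions or compiler changes proposed in the paper: the `SharedPtr` struct and the `shared_from_index` and `shared_add` helpers are hypothetical names, and the layout assumes a UPC-style pointer-to-shared made of a thread, a phase, and a local offset.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical software view of a UPC-style pointer-to-shared.
 * Field names are illustrative and not taken from the paper. */
typedef struct {
    size_t thread;  /* thread that owns the element */
    size_t phase;   /* position of the element inside its block */
    size_t offset;  /* element offset inside the owner's local portion */
} SharedPtr;

/* Map a global element index onto (thread, phase, offset) for a
 * block-cyclic distribution with `block` elements per block. */
SharedPtr shared_from_index(size_t i, size_t block, size_t threads)
{
    assert(block > 0 && threads > 0);
    SharedPtr p;
    p.thread = (i / block) % threads;
    p.phase  = i % block;
    p.offset = (i / (block * threads)) * block + p.phase;
    return p;
}

/* Advance a shared pointer by `inc` elements. Each step costs several
 * divisions and modulo operations when done in software, which is the
 * kind of work hardware support would aim to absorb. */
SharedPtr shared_add(SharedPtr p, size_t inc, size_t block, size_t threads)
{
    assert(block > 0 && threads > 0);
    /* Recover the global index, then re-derive the three fields. */
    size_t row = p.offset / block;
    size_t i = row * block * threads + p.thread * block + p.phase;
    return shared_from_index(i + inc, block, threads);
}
```

With a block size of 4 and 3 threads, element 13 maps to thread 0, phase 1, local offset 5, and advancing that pointer by 3 elements lands on thread 1, phase 0, local offset 4. Performing this computation on every shared access is what hand-tuned code tries to avoid, typically by casting to local pointers where the data is known to be local.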


Bibliographic details

Source
2014 IEEE Intl Conf on High Performance Computing and Communications, 2014 IEEE 6th Intl Symp on Cyberspace Safety and Security, 2014 IEEE 11th Intl Conf on Embedded Software and Syst | 2014 | pp. 1-10 | 10 pages
Conference location: Paris (FR)
Authors
Serres Olivier; Kayi Abdullah; Anbar Ahmad; El Ghazawi Tarek

Affiliation
George Washington Univ., Washington, DC, USA

Original format: PDF
Language: English
Keywords
High Performance Computing; Parallel architectures; Parallel programming


