MOPED: Orchestrating interprocess message data on CMPs


Abstract

Future CMPs will combine many simple cores with deep cache hierarchies. With more cores, cache resources per core are fewer, and must be shared carefully to avoid poor utilization due to conflicts and pollution. Explicit motion of data in these architectures, such as message passing, can provide hints about program behavior that can be used to hide latency and improve cache behavior. However, to make these models attractive, synchronization overhead and data copying must also be offloaded from the processors. In this paper, we describe a Message Orchestration and Performance Enhancement Device (MOPED) that provides hardware mechanisms to support state-of-the-art message passing protocols such as MPI. MOPED extends the per-processor cache controllers and coherence protocol to support message synchronization and management in hardware, to transfer message data efficiently without intermediate buffer copies, and to place useful data in caches in a timely manner. MOPED thus allows full overlap between communication and computation on the cores. We extended a 16-core full-system simulator based on Simics and FeS2. MOPED interacts with the directory controllers to orchestrate message data. We evaluated benefits to performance and coherence traffic by integrating MOPED into the MPICH runtime. Relative to unmodified MPI execution, MOPED reduces execution time of real applications (NAS Parallel Benchmarks) by 17–45% and of communication microbenchmarks (Intel's IMB) by 76–94%. Off-chip memory misses are reduced by 43–88% for applications and by 75–100% for microbenchmarks.
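The copy overhead the abstract mentions can be illustrated with a toy software model. The sketch below is not MOPED's hardware mechanism and does not use MPI; the function names `send_via_intermediate_buffer` and `send_direct` are hypothetical and exist only for illustration. It compares how many bytes are moved when a message passes through a staging buffer with how many are moved when it is written straight into the receiver's buffer.

```python
# Toy model (assumption: illustrative only, not the paper's design).
# It counts bytes copied for a buffered transfer versus a direct transfer.

def send_via_intermediate_buffer(src: bytes, dst: bytearray) -> int:
    """Copy src into a staging buffer, then into dst. Returns bytes copied."""
    staging = bytearray(src)          # copy 1: sender -> staging buffer
    dst[:len(staging)] = staging      # copy 2: staging buffer -> receiver
    return 2 * len(src)


def send_direct(src: bytes, dst: bytearray) -> int:
    """Copy src straight into dst. Returns bytes copied."""
    view = memoryview(src)            # no copy: a view over the sender's data
    dst[:len(view)] = view            # single copy: sender -> receiver
    return len(src)


message = bytes(range(256)) * 16      # 4096-byte payload
recv_a = bytearray(len(message))
recv_b = bytearray(len(message))

copied_buffered = send_via_intermediate_buffer(message, recv_a)
copied_direct = send_direct(message, recv_b)

print(f"buffered: {copied_buffered} bytes copied")
print(f"direct:   {copied_direct} bytes copied")
```

Both receivers end up holding the same payload, but the buffered path moves every byte twice. In this model the extra pass is work the processor would do itself, which is the kind of overhead the abstract proposes to offload.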


Bibliographic record

Source: IEEE International Symposium on High Performance Computer Architecture, 2011, 10 pages
Authors: Gu Junli; Lumetta Steven S.; Kumar Rakesh; Sun Yihe
Original format: PDF
Chinese Library Classification: TP3-53

Similar literature


1. MOPED: Accelerating Data Communication on Future CMPs [J]. Gu Junli, Sun Yihe, Lumetta Steven S. IEEE Micro, 2011, no. 4.
2. Exploiting address compression and heterogeneous interconnects for efficient message management in tiled CMPs [J]. Flores A., Acacio M.E., Aragón J.L. Journal of Systems Architecture, 2010, no. 9.
3. One-third of moped riders own tuned moped [J]. Fridulv Sagberg, Ole Jørgen Johansson. Nordic Road & Transport Research, 2018, no. 2.
4. MOPED: Orchestrating interprocess message data on CMPs [C]. Gu Junli, Lumetta Steven S., Kumar Rakesh. IEEE 17th International Symposium on High Performance Computer Architecture, 2011.
5. Efficient and High-Performance Data Orchestration for Large Scale Cloud Workloads [D]. Chen, Shouwei. 2021.
6. MOPED 2.5—An Integrated Multi-Omics Resource: Multi-Omics Profiling Expression Database Now Includes Transcriptomics Data [O] . Elizabeth Montague, Larissa Stanberry, Roger Higdon, -1

7. MOPED: Orchestrating Interprocess Message Data on CMPs [O]. Junli Gu. 2013.
