PHANTOM: predicting performance of parallel applications on large-scale parallel machines using a single node

Zhai Jidong; Chen Wenguang; Zheng Weimin

Abstract

Designers of large-scale parallel computers want to predict the performance of parallel applications at the design phase. This is difficult because the execution time of a parallel application is determined by several factors, including the sequential computation time in each process, the communication time, and their convolution. Despite previous efforts, it remains an open problem to estimate the sequential computation time of each process accurately and efficiently for large-scale parallel applications on target machines that do not yet exist. This paper proposes a novel approach to predicting the sequential computation time accurately and efficiently. We assume that at least one node of the target platform is available, but the whole target system need not be. We make two main technical contributions. First, we employ deterministic replay techniques to execute any process of a parallel application on a single node at real speed. As a result, we can simply measure the real sequential computation time on a target node for each process, one by one. Second, we observe that the computation behavior of processes in parallel applications can be clustered into a few groups, with the processes in each group behaving similarly. This observation reduces measurement time significantly because we only need to execute representative processes instead of all of them. We have implemented a performance prediction framework, called PHANTOM, which integrates this computation-time acquisition approach with a trace-driven network simulator. We validate our approach on several platforms. For ASCI Sweep3D, the error of our approach is less than 5% on 1024 processor cores. Compared to a recent regression-based prediction approach, PHANTOM achieves better prediction accuracy across different platforms.
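The second contribution (measuring only one representative process per group) can be illustrated with a minimal Python sketch. This is not PHANTOM's implementation: the abstract does not say how behavior is characterized or which clustering method is used. The per-process feature vectors, the greedy threshold grouping, the `tol` parameter, and the `measure` callback (standing in for a replayed run of one process on a target node) are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Sequence


def _relative_distance(a: Sequence[float], b: Sequence[float]) -> float:
    # Largest per-coordinate relative difference between two
    # equal-length, non-empty feature vectors.
    return max(abs(x - y) / max(abs(x), abs(y), 1e-12) for x, y in zip(a, b))


def group_processes(features: Sequence[Sequence[float]],
                    tol: float = 0.05) -> List[List[int]]:
    """Greedy grouping (an assumed stand-in for the paper's clustering).

    A rank joins the first group whose representative (the group's
    first member) lies within `tol` relative distance; otherwise it
    starts a new group.
    """
    groups: List[List[int]] = []
    for rank, vec in enumerate(features):
        for group in groups:
            if _relative_distance(vec, features[group[0]]) <= tol:
                group.append(rank)
                break
        else:
            groups.append([rank])
    return groups


def estimate_times(features: Sequence[Sequence[float]],
                   measure: Callable[[int], float],
                   tol: float = 0.05) -> Dict[int, float]:
    """Measure one representative per group and reuse its time.

    `measure(rank)` is a hypothetical callback that returns the
    sequential computation time of the given rank.
    """
    times: Dict[int, float] = {}
    for group in group_processes(features, tol):
        rep_time = measure(group[0])  # only the representative is executed
        for rank in group:
            times[rank] = rep_time
    return times
```

With five processes whose feature vectors fall into two clusters, `estimate_times` would call `measure` twice rather than five times, which is where the reduction in measurement time comes from. The resulting per-process times would then be handed to a network simulator together with the communication trace; that step is outside the scope of this sketch.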

Bibliographic details

Source
ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages | 2010, Issue 5 | 10 pages
Authors
Zhai Jidong; Chen Wenguang; Zheng Weimin
Indexing information
Original format: PDF
Language: English
CLC classification: Computer software
Keywords
deterministic replay; parallel application; performance prediction; trace-driven simulation

Similar literature

1. PHANTOM: predicting performance of parallel applications on large-scale parallel machines using a single node [J]. Zhai Jidong, Chen Wenguang, Zheng Weimin. ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages. 2010, Issue 5
2. Performance Analysis of Homogeneous On-Chip Large-Scale Parallel Computing Architectures for Data-Parallel Applications [J]. Xiaowen Chen, Zhonghai Lu, Axel Jantsch. Journal of Electrical and Computer Engineering. 2015, Issue 1
3. Predicting parallel application performance via machine learning approaches [J]. Karan Singh, Engin Ipek, Sally A. McKee. Concurrency and Computation. 2007, Issue 17
4. Phantom: Predicting Performance of Parallel Applications on Large-Scale Parallel Machines Using a Single Node [C]. Jidong Zhai, Wenguang Chen, Weimin Zheng. Principles and practice of parallel programming. 2010
5. A journey through performance evaluation, tuning, and analysis of parallelized applications and parallel architectures: Quantitative approach [D]. Mustafa, Dheya G. 2013
6. Design of high-performance parallelized gene predictors in MATLAB [O]. Sylvain Robert Rivard, Jean-Gabriel Mailloux, Rachid Beguenane. 2012
7. Performance Analysis of Homogeneous On-Chip Large-Scale Parallel Computing Architectures for Data-Parallel Applications [O]. Xiaowen Chen, Zhonghai Lu, Axel Jantsch. 2015
