Journal: ACM SIGPLAN Notices: A Monthly Publication of the Special Interest Group on Programming Languages

PHANTOM: predicting performance of parallel applications on large-scale parallel machines using a single node


Abstract

Designers of large-scale parallel computers want to predict the performance of parallel applications at the design phase. This is difficult because the execution time of a parallel application is determined by several factors, including the sequential computation time in each process, the communication time, and their convolution. Despite previous efforts, accurately and efficiently estimating the sequential computation time of each process for large-scale parallel applications on non-existing target machines remains an open problem. This paper proposes a novel approach to predict the sequential computation time accurately and efficiently. We assume that at least one node of the target platform is available, but the whole target system need not be. We make two main technical contributions. First, we employ deterministic replay techniques to execute any process of a parallel application on a single node at real speed. As a result, we can simply measure the real sequential computation time on a target node for each process, one by one. Second, we observe that the processes of a parallel application can be clustered into a few groups such that the processes in each group have similar computation behavior. This observation lets us reduce measurement time significantly because we only need to execute representative processes instead of all of them. We have implemented a performance prediction framework, called PHANTOM, which integrates the above computation-time acquisition approach with a trace-driven network simulator. We validate our approach on several platforms. For ASCI Sweep3D, the error of our approach is less than 5% on 1024 processor cores. Compared to a recent regression-based prediction approach, PHANTOM achieves better prediction accuracy across different platforms.
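The second contribution (measuring only representative processes) can be illustrated with a minimal sketch. It assumes each process is summarized by a hypothetical feature vector of its computation behavior. The greedy threshold grouping is a stand-in chosen for brevity, not necessarily the clustering method PHANTOM uses. The names `group_processes`, `predict_compute_times`, and `fake_measure` are illustrative only.

```python
from math import dist

def group_processes(vectors, threshold):
    """Greedily group process ranks whose feature vectors lie within
    `threshold` (Euclidean distance) of a group's representative."""
    groups = []  # list of (representative_rank, [member_ranks])
    for rank, vec in enumerate(vectors):
        for rep, members in groups:
            if dist(vectors[rep], vec) <= threshold:
                members.append(rank)
                break
        else:
            # No existing group is close enough: this rank starts a new group.
            groups.append((rank, [rank]))
    return groups

def predict_compute_times(groups, measure):
    """Measure only each group's representative and reuse that time
    for every member of the group."""
    times = {}
    for rep, members in groups:
        t = measure(rep)
        for rank in members:
            times[rank] = t
    return times

# Hypothetical per-process feature vectors (assumed data, not from the paper).
vectors = [
    (1.0, 2.0), (1.1, 2.0), (5.0, 0.5),
    (5.1, 0.4), (1.0, 2.1), (5.0, 0.6),
]
groups = group_processes(vectors, threshold=0.5)

measured = []  # records which ranks were actually "executed"

def fake_measure(rank):
    """Stand-in for replaying one process on a single target node."""
    measured.append(rank)
    return sum(vectors[rank])

times = predict_compute_times(groups, fake_measure)
```

In this toy input, six processes fall into two groups, so only two of them need to be measured. The saving grows with the ratio of processes to groups.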
