New Performance Modeling Methods for Parallel Data Processing Applications

Bhimani Janki; Mi Ningfang; Leeser Miriam; Yang Zhengyu


Abstract

Predicting the performance of an application running on parallel computing platforms is becoming increasingly important because of its influence on development time and resource management. However, predicting performance with respect to parallel processes is complex for iterative and multi-stage applications. This research proposes a performance approximation approach, FiM, which predicts the calculation time (with FiM-Cal) and the communication time (with FiM-Com) of an application running on a distributed framework. FiM-Cal consists of two key components that are coupled with each other: (1) a stochastic Markov model to capture non-deterministic runtime that often depends on parallel resources, e.g., the number of processes, and (2) a machine-learning model that extrapolates the parameters for calibrating our Markov model when application parameters, such as the dataset, change. Along with the parallel calculation time, parallel computing platforms spend some data transfer time communicating among different nodes. FiM-Com consists of a simulation queuing model to quickly estimate communication time. Our new modeling approach considers different design choices along multiple dimensions, namely (i) process-level parallelism, (ii) distribution of cores on a multi-processor platform, (iii) application-related parameters, and (iv) characteristics of datasets. The major contribution of our prediction approach is that FiM can provide an accurate prediction of parallel processing time for datasets that are much larger than the training datasets. We evaluate our approach with NAS Parallel Benchmarks and real iterative data processing applications. We compare the predicted results (e.g., end-to-end execution time) with actual experimental measurements on a real distributed platform. We also compare our work with an existing prediction technique based on machine learning. We rank the number of processes according to the actual and predicted results from FiM and calculate the correlation between the actual and predicted rankings. Our results show that FiM obtains a high correlation in the range of 0.80 to 0.99, which indicates considerable accuracy of our technique. Such prediction gives data analysts useful insight into the optimal configuration of parallel resources (e.g., number of processes and number of cores) and also helps system designers investigate the impact of changes in application parameters on system performance.
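
The ranking evaluation described in the abstract can be illustrated with a minimal sketch. The abstract does not say which correlation measure the authors use, so this example assumes Spearman's rank correlation (Pearson correlation computed on ranks). The function names and the execution times are hypothetical, made-up values for illustration only; they are not measurements or code from the paper.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sum((a - mean_x) ** 2 for a in x) ** 0.5
    std_y = sum((b - mean_y) ** 2 for b in y) ** 0.5
    if std_x == 0 or std_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (std_x * std_y)


def rank_correlation(actual, predicted):
    """Spearman rank correlation (an assumed choice of measure)."""
    return pearson(ranks(actual), ranks(predicted))


# Hypothetical execution times (seconds) for 1, 2, 4, 8, and 16 processes.
actual_times = [100, 55, 30, 20, 22]
predicted_times = [98, 57, 33, 24, 21]

rho = rank_correlation(actual_times, predicted_times)
print(f"{rho:.2f}")  # prints 0.90
```

In this made-up example the predicted ranking swaps only the two fastest configurations, so the correlation stays high even though the predicted best process count differs from the actual one.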


Bibliographic record

Source
ACM Transactions on Modeling and Computer Simulation | 2019, Issue 3 | pp. 15.1-15.24 | 24 pages
Authors
Bhimani Janki; Mi Ningfang; Leeser Miriam; Yang Zhengyu
Author affiliation

Northeastern Univ., 360 Huntington Ave., Boston, MA 02115, USA

Original format: PDF
Language: English
Keywords
Performance modeling; queuing theory; Markov model; distributed systems; execution time; parallel calculation; communication network; prediction;


