International Journal of Image, Graphics and Signal Processing

Performance Framework for HPC Applications on Homogeneous Computing Platform


Abstract

In scientific fields, solving large and complex computational problems with central processing units (CPUs) alone is not enough to meet the computation requirements. In this work we consider a homogeneous cluster in which every node has a CPU and a graphics processing unit (GPU) of the same capability. Normally the CPU is used to control the GPU and to transfer data to it; here we combine the CPU's computational power with the GPU to run high-performance computing (HPC) applications. The framework adopts a pinned-memory technique to reduce the overhead of data transfer between CPU and GPU. To exploit the homogeneous platform we use a hybrid programming model combining the message passing interface (MPI), OpenMP (open multi-processing), and the Compute Unified Device Architecture (CUDA). The key challenge on the homogeneous platform is allocating the workload among the CPU and GPU cores. To address this challenge we propose a novel analytical workload-division strategy that predicts an effective division of work between the CPU and the GPU. Using our hybrid programming model and workload-division strategy, we observe average performance improvements of 76.06% and 84.11% in giga floating-point operations per second (GFLOPS) for N-dynamic vector addition on an NVIDIA TESLA M2075 cluster and on NVIDIA QUADRO K2000 nodes of a cluster, respectively, compared with the performance models of Simplice Donfack et al. [5]. In addition, using the pinned-memory technique with the hybrid programming model yields average performance improvements of 33.83% and 39.00% on the NVIDIA TESLA M2075 and NVIDIA QUADRO K2000, respectively, for SAXPY applications compared with the pageable-memory technique.
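The abstract's workload-division idea can be illustrated with a minimal sketch. The paper's actual analytical model is not reproduced here; the function below assumes a simple proportional split in which each processor receives work in proportion to its measured throughput, a common baseline for CPU–GPU partitioning. The function name and parameters are illustrative, not taken from the paper.

```python
# Sketch of a proportional CPU/GPU workload split (an assumed baseline
# model, not the paper's analytical strategy).

def divide_workload(n, cpu_gflops, gpu_gflops):
    """Split n work items between CPU and GPU in proportion to
    each device's sustained throughput (in GFLOPS)."""
    total = cpu_gflops + gpu_gflops
    gpu_share = round(n * gpu_gflops / total)  # GPU gets its throughput fraction
    cpu_share = n - gpu_share                  # CPU takes the remainder
    return cpu_share, gpu_share

# Example: a node whose GPU sustains 3x the CPU's throughput
# receives 3/4 of a vector-addition workload.
cpu_n, gpu_n = divide_workload(1_000_000, 100.0, 300.0)
```

In practice the split would be refined to account for the CPU-to-GPU transfer cost that the pinned-memory technique is meant to reduce.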
