Explicit Fourth-Order Runge-Kutta Method on Intel Xeon Phi Coprocessor

Beata Bylina; Joanna Potiopa

Abstract

This paper concerns an Intel Xeon Phi implementation of the explicit fourth-order Runge-Kutta method (RK4) for very sparse matrices with very short rows. Such matrices arise in Markovian modeling of computer and telecommunication networks. Two implementations are investigated: one based on Intel Math Kernel Library (Intel MKL) routines and the authors' own, both using the CSR storage scheme and running on Intel Xeon Phi. The Intel MKL-based implementation uses the high-performance BLAS and Sparse BLAS routines. The authors' implementation follows the OpenMP programming style: the SpMV operation and vector addition are implemented using basic optimization techniques and vectorization. The approach is evaluated in native and offload modes for various numbers of cores and thread affinity settings. The two implementations are compared in terms of time, speedup, and performance. Numerical experiments on Intel Xeon Phi show that the authors' implementation is very promising: it gives a gain of up to two times over the multithreaded Intel MKL-based implementation running on a CPU (Intel Xeon processor), and even three times over the application that uses Intel MKL on Intel Xeon Phi.
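
The two kernels named in the abstract, SpMV over a CSR matrix and vector addition, can be illustrated with a minimal sequential sketch in Python. It assumes a linear system dy/dt = A y, which the abstract does not state explicitly, and all function names (`csr_spmv`, `axpy`, `rk4_step`) are hypothetical. It is not the authors' OpenMP code and does no threading or vectorization; it only shows how one RK4 step reduces to four SpMV calls plus vector updates.

```python
def csr_spmv(values, col_idx, row_ptr, x):
    """Return y = A @ x for a matrix A stored in CSR form.

    values  : nonzero entries, row by row
    col_idx : column index of each entry in `values`
    row_ptr : row i occupies values[row_ptr[i]:row_ptr[i + 1]]
    """
    n = len(row_ptr) - 1
    y = [0.0] * n
    for i in range(n):
        acc = 0.0
        # With very short rows this inner loop runs only a few iterations.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


def axpy(alpha, x, y):
    """Return alpha * x + y (the vector-addition kernel)."""
    return [alpha * xi + yi for xi, yi in zip(x, y)]


def rk4_step(values, col_idx, row_ptr, y, h):
    """Advance y by one classical RK4 step of size h for dy/dt = A y (assumed form)."""
    def f(v):
        return csr_spmv(values, col_idx, row_ptr, v)

    k1 = f(y)
    k2 = f(axpy(h / 2.0, k1, y))
    k3 = f(axpy(h / 2.0, k2, y))
    k4 = f(axpy(h, k3, y))
    return [
        yi + (h / 6.0) * (a + 2.0 * b + 2.0 * c + d)
        for yi, a, b, c, d in zip(y, k1, k2, k3, k4)
    ]


if __name__ == "__main__":
    # Illustrative 2x2 matrix [[-2, 1], [2, -1]] whose columns sum to zero,
    # so the sum of the components of y is preserved by each step.
    values = [-2.0, 1.0, 2.0, -1.0]
    col_idx = [0, 1, 0, 1]
    row_ptr = [0, 2, 4]
    y = [1.0, 0.0]
    for _ in range(10):
        y = rk4_step(values, col_idx, row_ptr, y, 0.1)
    print(y, sum(y))
```

In this sketch the outer loop over rows in `csr_spmv` is the natural place for a parallel loop, since each row writes a distinct entry of `y`.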

Bibliographic details

Source
International Journal of Parallel Programming, 2017, Issue 5, pp. 1073-1090 (18 pages)
Authors
Beata Bylina; Joanna Potiopa
Affiliation
Department of Computer Science, Maria Curie-Sklodowska University, Plac M. Curie-Sklodowskiej 1, 20-031 Lublin, Poland (both authors)
Indexed in: Science Citation Index (SCI); Engineering Index (EI)
Original format: PDF
Language: English
Keywords
Intel Xeon Phi; Fourth-order Runge-Kutta method; CSR format; Intel Math Kernel Library (Intel MKL); SpMV; OpenMP

