Conference: High-performance computing and networking

HPF on Intel Paragon and CRAFT on CRAY T3D: Basic performance measurements and experiments with a block-sparse CG-algorithm


Abstract

After the proposal for HPF had been finalized, it took only a very short time until the first compilers appeared, mainly those from APR and PGI. The parallel programming model of HPF covers the partitioning of data arrays at compile time and the distribution of data and DO loops onto PEs according to the owner-computes rule. Non-distributed data arrays and scalars are replicated. Efficiency can be achieved through a good partitioning of work and data onto PEs. CRAFT is the CRay Adaptive ForTran of Cray Research. In addition to the shared-variable concept of HPF, CRAFT also allows 'private' variables. Therefore CRAFT can be mixed with message passing and explicit shared-memory functions and can perform shared-to-private coercion. It is also possible to share work at the subroutine level and to define sequential regions and explicit synchronization points.

This paper presents experiences and results with the APR and PGI HPF compilers on the Intel Paragon and with CRAFT on the CRAY T3D. Motivated by the wide use of unstructured discretizations in CFD and structural mechanics, we examine the parallelization of a block-sparse Conjugate Gradient (CG) algorithm. An overview of the adapted BCCS format and the corresponding data distribution is given. We describe the difficulties, restrictions and results of using this storage format for efficiently calculating the sparse matrix-vector product, which is the dominant operation in the CG algorithm.
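The abstract does not spell out the adapted BCCS format or its HPF/CRAFT data distribution, so the following is only a minimal serial sketch of the kind of kernel it refers to. It assumes a plain block compressed column storage layout (square dense blocks of size `b`, a block-column pointer array, and block-row indices). The names `bccs_matvec`, `cg`, `col_ptr`, `row_idx`, and `blocks` are hypothetical and not taken from the paper, and Python stands in for Fortran purely for illustration.

```python
def bccs_matvec(b, col_ptr, row_idx, blocks, x):
    """Compute y = A x for A stored by block columns (assumed BCCS layout).

    b        size of each dense square block
    col_ptr  col_ptr[j]..col_ptr[j+1]-1 index the blocks of block column j
    row_idx  block-row index of each stored block
    blocks   list of b-by-b dense blocks (lists of rows)
    """
    y = [0.0] * len(x)
    for j in range(len(col_ptr) - 1):
        xj = x[j * b:(j + 1) * b]          # slice of x owned by block column j
        for k in range(col_ptr[j], col_ptr[j + 1]):
            i0 = row_idx[k] * b            # first scalar row of this block
            blk = blocks[k]
            for r in range(b):
                # Column storage scatters contributions into y.
                y[i0 + r] += sum(blk[r][c] * xj[c] for c in range(b))
    return y


def dot(u, v):
    return sum(a * c for a, c in zip(u, v))


def cg(matvec, rhs, tol=1e-12, max_iter=100):
    """Textbook conjugate gradient for a symmetric positive definite A."""
    x = [0.0] * len(rhs)
    r = list(rhs)                          # residual for the zero start vector
    p = list(r)
    rr = dot(r, r)
    for _ in range(max_iter):
        if rr <= tol * tol:
            break
        q = matvec(p)                      # the dominant operation per iteration
        alpha = rr / dot(p, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return x


# Illustrative 4x4 SPD matrix made of 2x2 blocks (made-up data).
D = [[4.0, 1.0], [1.0, 3.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]
b = 2
col_ptr = [0, 2, 4]
row_idx = [0, 1, 0, 1]
blocks = [D, I2, I2, D]

y = bccs_matvec(b, col_ptr, row_idx, blocks, [1.0, 1.0, 1.0, 1.0])
x_sol = cg(lambda v: bccs_matvec(b, col_ptr, row_idx, blocks, v), y)
```

In this layout each block column touches only its own slice of `x` but may update any part of `y`. That scattered write is one plausible reason a column-oriented format is awkward under an owner-computes distribution, though the paper's actual difficulties may differ.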
