Journal: Cybernetics and information technologies: CIT

Performance Analysis of a Scalable Algorithm for 3D Linear Transforms on Supercomputer with Intel Processors/Co-Processors

Abstract

Practical realizations of 3D forward/inverse separable discrete transforms, such as the Fourier transform, the cosine/sine transform, etc., are frequently the principal limiters that prevent many practical applications from scaling to a large number of processors. Existing approaches, which are based primarily on 1D or 2D data decompositions, prevent the 3D transforms from effectively scaling to the maximum (possible/available) number of computer nodes. A highly scalable approach to realize forward/inverse 3D transforms has been proposed. It is based on a 3D decomposition of data and geared towards a torus network of computer nodes. The proposed algorithms require compute-and-roll time-steps, where each step consists of an execution of multiple GEMM operations and concurrent movement of cubical data blocks between nearest neighbors. The aim of this paper is to present an experimental performance study of an implementation on a high-performance computer architecture.
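
The two ideas named in the abstract can be illustrated with a small NumPy sketch. It is not the paper's implementation: the function names (`dft_matrix`, `separable_transform_3d`, `rolled_matmul`) are hypothetical, the DFT is chosen only as one example of a separable transform, and the compute-and-roll part is simplified to a simulated 1D ring of nodes rather than a 3D torus. The first function applies a separable 3D transform as three sets of matrix products (GEMM-like operations), one per axis; the second computes one such product in steps, where every simulated node multiplies the block it currently holds and then passes it to its neighbor.

```python
import numpy as np

def dft_matrix(n):
    """Return the n x n forward DFT coefficient matrix."""
    k = np.arange(n)
    return np.exp(-2j * np.pi * np.outer(k, k) / n)

def separable_transform_3d(x, c1, c2, c3):
    """Apply coefficient matrices c1, c2, c3 along axes 0, 1, 2 of x.

    Each einsum call is a batch of matrix-matrix products.
    """
    y = np.einsum("ai,ijk->ajk", c1, x)
    y = np.einsum("bj,ajk->abk", c2, y)
    y = np.einsum("ck,abk->abc", c3, y)
    return y

def rolled_matmul(c, x, p):
    """Compute c @ x in p compute-and-roll steps on a simulated ring.

    Assumption: p divides c.shape[0]. Node q owns row block q of c and
    accumulates row block q of the result.
    """
    n = c.shape[0]
    b = n // p
    x_blocks = [x[q * b:(q + 1) * b] for q in range(p)]
    out_dtype = np.result_type(c, x)
    y_blocks = [np.zeros((b,) + x.shape[1:], dtype=out_dtype) for _ in range(p)]
    for s in range(p):
        # Compute: node q currently holds original block (q + s) % p of x.
        for q in range(p):
            r = (q + s) % p
            y_blocks[q] += c[q * b:(q + 1) * b, r * b:(r + 1) * b] @ x_blocks[q]
        # Roll: every node hands its block to its nearest neighbor.
        x_blocks = x_blocks[1:] + x_blocks[:1]
    return np.concatenate(y_blocks)
```

With DFT matrices as coefficients, `separable_transform_3d` can be checked against `np.fft.fftn`, and `rolled_matmul(c, x, p)` against `c @ x`. In a distributed setting the inner loop over `q` would run concurrently on the nodes, and the roll would overlap with the computation.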
