An efficient CELL library for Lattice Quantum Chromodynamics

Abstract

Quantum chromodynamics (QCD) is the theory of sub-nuclear physics. It aims at modeling the strong nuclear force, which is responsible for the interactions of nuclear particles. Numerical QCD studies are performed through a discrete formalism called LQCD (Lattice Quantum Chromodynamics). Typical simulations involve very large volumes of data and numerically sensitive entities, hence the crucial need for high performance computing systems. We propose a set of CELL-accelerated routines for basic LQCD calculations. Our framework is provided as a unified library and is particularly optimized for iterative use. Each routine is parallelized among the SPUs, and each SPU achieves its task by looping over small chunks of arrays from the main memory. Our SPU implementation is vectorized with double precision data, and the cooperation with the PPU shows a good overlap between data transfers and computations. Moreover, we permanently keep the SPU context and use mailboxes to synchronize between consecutive calls. We validate our library by using it to derive a CELL version of an existing LQCD package (tmLQCD). Experimental results on individual routines show a significant speedup compared to a standard processor, for instance 11 times better than a 2.83 GHz INTEL processor (without SSE). This ratio is around 9 (with a QS22 blade) when considering a more cooperative context such as solving a linear system of equations (usually referred to as Wilson-Dirac inversion). Our results clearly demonstrate that the CELL is a very promising platform for large-scale LQCD simulations.
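The chunked, per-SPU processing described in the abstract can be illustrated with a minimal sketch in plain C. This is not the authors' code: the routine (`y += a * x` on double precision arrays), the names `axpy_partitioned`, `axpy_worker`, `fetch`, `store`, and the `CHUNK` size are all assumptions made for illustration. `memcpy` stands in for DMA transfers, the workers run one after another rather than concurrently, and mailboxes and vectorization are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4 /* elements per local buffer; hypothetical and tiny on purpose */

/* Stand-in for a DMA "get" from main memory into a local buffer. */
static void fetch(double *local, const double *main_mem, size_t count) {
    memcpy(local, main_mem, count * sizeof(double));
}

/* Stand-in for a DMA "put" from a local buffer back to main memory. */
static void store(double *main_mem, const double *local, size_t count) {
    memcpy(main_mem, local, count * sizeof(double));
}

/* One worker's share: y[first .. first+count) += a * x[first .. first+count),
 * processed chunk by chunk with two alternating buffer sets. */
static void axpy_worker(double a, const double *x, double *y,
                        size_t first, size_t count) {
    double xb[2][CHUNK], yb[2][CHUNK];
    size_t done = 0;
    size_t len = count < CHUNK ? count : CHUNK;
    int cur = 0;

    if (count == 0) return;
    fetch(xb[cur], x + first, len);
    fetch(yb[cur], y + first, len);

    while (done < count) {
        size_t next_off = done + len;
        size_t next_len = 0;
        if (next_off < count) {
            size_t left = count - next_off;
            next_len = left < CHUNK ? left : CHUNK;
            /* On real hardware this transfer would be asynchronous and
             * would overlap with the computation just below. */
            fetch(xb[1 - cur], x + first + next_off, next_len);
            fetch(yb[1 - cur], y + first + next_off, next_len);
        }
        for (size_t i = 0; i < len; i++)
            yb[cur][i] += a * xb[cur][i];
        store(y + first + done, yb[cur], len);

        done = next_off;
        len = next_len;
        cur = 1 - cur; /* swap the buffer sets */
    }
}

/* Split n elements as evenly as possible among n_workers and run each share.
 * The shares are independent, so they could run in parallel. */
void axpy_partitioned(double a, const double *x, double *y,
                      size_t n, size_t n_workers) {
    assert(n_workers > 0);
    size_t base = n / n_workers, extra = n % n_workers, first = 0;
    for (size_t w = 0; w < n_workers; w++) {
        size_t count = base + (w < extra ? 1 : 0);
        axpy_worker(a, x, y, first, count);
        first += count;
    }
}
```

Calling `axpy_partitioned(2.0, x, y, 23, 3)` gives the three workers 8, 8, and 7 elements, each handled in chunks of at most `CHUNK` elements. In an actual SPU implementation, the prefetch of the next chunk is what lets data transfers overlap with computation.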

Bibliographic record

  • Source
    Computer Architecture News | 2010, Issue 4 | pp. 60-65 | 6 pages
  • Author affiliations

    Linear Accelerator Laboratory/CNRS/IN2P3 University of Orsay, Faculty of Sciences, Bat. 200 91898 Orsay Cedex (France);

    Linear Accelerator Laboratory/CNRS/IN2P3 University of Orsay, Faculty of Sciences, Bat. 200 91898 Orsay Cedex (France);

    Theoretical Physics Laboratory University of Orsay, Faculty of Sciences, Bat. 210 91405 Orsay Cedex (France);

  • Original format: PDF
  • Text language: English
  • Keywords

    LQCD; linear algebra; parallelism; CELL;

