International Journal of Parallel Programming

The Bottom-Up Implementation of One MILC Lattice QCD Application on the Cell Blade

Abstract

We report the results of the bottom-up implementation of one MILC lattice quantum chromodynamics (QCD) application on the Cell Broadband Engine™ processor. In our implementation, we preserve MILC’s framework for scaling the application to run on a large number of compute nodes and accelerate computationally intensive kernels on the Cell’s synergistic processor elements. Speedups of 3.4× for the 8 × 8 × 16 × 16 lattice and 5.7× for the 16 × 16 × 16 × 16 lattice are obtained when comparing our implementation of the MILC application executed on a 3.2 GHz Cell processor to the standard MILC code executed on a quad-core 2.33 GHz Intel Xeon processor. We provide an empirical model to predict application performance for a given lattice size. We also show that performance of the compute-intensive part of the application on the Cell processor is limited by the bandwidth between main memory and the Cell’s synergistic processor elements, whereas performance of the application’s parallel execution framework is limited by the bandwidth between main memory and the Cell’s power processor element.