Journal: Journal of Computational Physics

Towards large-scale multi-socket, multicore parallel simulations: Performance of an MPI-only semiconductor device simulator


Abstract

This preliminary study considers the scaling and performance of a finite element (FE) semiconductor device simulator on a set of multi-socket, multicore architectures with nonuniform memory access (NUMA) compute nodes. These multicore architectures include two Linux clusters with multicore processors, a quad-socket, quad-core AMD Opteron platform and a dual-socket, quad-core Intel Xeon Nehalem platform, as well as a dual-socket, six-core AMD Opteron workstation. These platforms have complex memory hierarchies that include local core-based cache, local socket-based memory, memory on the same mainboard accessed from another socket, and memory reached across network links to different nodes. The specific semiconductor device simulator used in this study employs a fully coupled Newton-Krylov solver with domain decomposition and multilevel preconditioners. Scaling results presented include a large-scale problem of more than 100 million unknowns on 4096 cores and a comparison with the Cray XT3/4 Red Storm capability platform. Although the MPI-only device simulator employed for this work can take advantage of all the cores of quad-core and six-core CPUs, the efficiency of the linear system solve decreases as core count increases, and eventually a different programming paradigm will be needed.
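
To make the phrase "Newton-Krylov solver" concrete, the following is a minimal serial sketch and not the simulator from the study. It assumes a toy 1D problem, -u'' + u^3 = 1 with zero boundary values, chosen only because its Jacobian is symmetric positive definite, so plain conjugate gradients can stand in for the Krylov method. The abstract does not say which Krylov method the simulator uses, and the domain decomposition, multilevel preconditioning, and MPI parallelism it describes are all omitted here. Every function name below is hypothetical.

```python
import math


def residual(u, h):
    """Nonlinear residual F(u) for -u'' + u^3 = 1, zero Dirichlet boundaries."""
    n = len(u)
    out = []
    for i in range(n):
        left = u[i - 1] if i > 0 else 0.0
        right = u[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * u[i] - left - right) / h**2 + u[i] ** 3 - 1.0)
    return out


def jacobian_vector(u, v, h):
    """Matrix-free product J(u) v; the Jacobian is never assembled."""
    n = len(u)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append((2.0 * v[i] - left - right) / h**2 + 3.0 * u[i] ** 2 * v[i])
    return out


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def conjugate_gradient(apply_a, b, rel_tol=1e-12, max_iter=500):
    """Unpreconditioned CG for a symmetric positive definite operator."""
    x = [0.0] * len(b)
    r = list(b)
    p = list(r)
    rs = dot(r, r)
    if rs == 0.0:
        return x
    target = rel_tol * math.sqrt(rs)
    for _ in range(max_iter):
        ap = apply_a(p)
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        if math.sqrt(rs_new) <= target:
            break
        beta = rs_new / rs
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def newton_krylov(n=30, tol=1e-8, max_newton=20):
    """Outer Newton loop; each linear solve J(u) d = -F(u) uses CG."""
    h = 1.0 / (n + 1)
    u = [0.0] * n
    for iteration in range(max_newton):
        f = residual(u, h)
        if math.sqrt(dot(f, f)) < tol:
            return u, iteration
        step = conjugate_gradient(
            lambda v: jacobian_vector(u, v, h), [-fi for fi in f]
        )
        u = [ui + si for ui, si in zip(u, step)]
    return u, max_newton


if __name__ == "__main__":
    solution, iterations = newton_krylov()
    print("Newton iterations:", iterations)
    print("max(u):", max(solution))
```

In a parallel code of the kind the abstract describes, the inner linear solve is typically where inter-process communication concentrates, which is consistent with the abstract singling out the linear system solve as the part whose efficiency drops at higher core counts.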