Journal: International Journal of Computer Systems Science & Engineering

Performance analysis and tuning for clusters with ccNUMA nodes for scientific computing - a case study


Abstract

In the quest for higher performance, and with the increasing availability of multi-core chips, many systems now pack more processors per node. Adopting a ccNUMA node architecture in these cases promises a balance between cost and performance. In this paper, a 2312-core Opteron system based on Sun Fire servers is considered as a case study to examine the performance issues associated with such architectures. In this study, we characterize the performance behavior of the system, with a focus on the node level, using different configurations. It is shown that the benefits from larger nodes can be severely limited for many reasons. These reasons were isolated, the associated performance losses were assessed, and some potential solutions were proposed. With the proposed performance tuning, up to 30% application performance improvement was observed. The results revealed that such problems were mainly caused by topological imbalances, limitations of the cache coherence protocol used, the distribution of operating system services, and the lack of intelligent management of memory affinity. In addition, the experimental analysis provided can be used by HPC application developers to better understand clusters with ccNUMA nodes, and as a guideline for the use of such architectures for scientific computing.
