ACM/IEEE conference on Supercomputing

Optimization of a parallel ocean general circulation model

Abstract

Global climate modeling is one of the grand challenges of computational science, and ocean modeling plays an important role both in understanding current climatic conditions and in predicting future climate change. Three-dimensional time-dependent ocean general circulation models (OGCMs) require a large amount of memory and processing time to run realistic simulations. Recent advances in computing hardware have dramatically affected the prospect of studying the global climate. The significant computational resources of massively parallel supercomputers promise to make such studies feasible. In addition to using advanced hardware, designing and implementing a well-optimized parallel ocean code will significantly improve the computational performance and reduce the total research time needed to complete these studies.

In our present work, we chose the most widely used OGCM code as our base code. This OGCM is based on the Parallel Ocean Program (POP) developed in FORTRAN 90 on the Los Alamos CM-2 Connection Machine by the Los Alamos ocean modeling research group. During the first half of 1994, the code was ported to the Cray T3D by Cray Research using SHMEM-based message passing. Since the code on the T3D was still time-consuming for large problems, improving the code performance was considered essential.

We have developed several general strategies to optimize the ocean general circulation model on the Cray T3D. These strategies include memory optimization, effective use of arithmetic pipelines, and use of optimized libraries. The optimized code runs 2 to 2.5 times faster than the original code, a significant performance improvement for modeling large-scale ocean flows. Many test runs of both the original and the optimized code have been carried out on the Cray T3D using various numbers of processors (1-256). Comparisons are made for a variety of real-world problems. The optimized code scales nearly linearly, and its speedup data also shows an excellent improvement over the original code.

In addition to discussing the optimization of the code, we also address the issue of portability. Given the short life cycle of massively parallel computers, usually on the order of three to five years, we emphasize the portability of the ocean model and the associated optimization routines across several computing platforms. Currently, the ocean modeling code has been ported successfully to the Hewlett Packard (HP)/Convex SPP-2000, and is readily portable to the Cray T3E.

This paper reports our efforts to optimize the parallel implementations of the oceanic model. So far, the work has focused on improving the load balancing and single-node performance of the code on the Cray T3D. As a result, the atmosphere and ocean model components running side by side can achieve a performance level of slightly more than 10 GFLOPS on 512 processors of that machine. We have also developed a user-friendly coupling interface with atmospheric and biogeochemical models, to make global climate modeling more complete and more realistic.
