首页> 外文期刊>Parallel Computing >Resilient computational applications using Coarray Fortran
【24h】

Resilient computational applications using Coarray Fortran

机译:使用Coarray Fortran的弹性计算应用程序

获取原文
获取原文并翻译 | 示例

摘要

With the increase in the number of hardware components and layers of the software stack in High Performance Computing (HPC) there will likely be an increment in number of hardware and software failures, which will be user-visible. Even under the most optimistic assumptions about the individual components reliability, probabilistic amplification from using millions of nodes has a dramatic impact on the Mean Time Between Failure (MTBF) of the entire platform.Although several techniques to address this problem have been developed, the support provided by the programming model, for the user to mitigate or work around this issue, is still insufficient. The Fortran 2018 standard defines failed images, a new feature that allows the programmer to detect and manage image failures in a parallel program.In this paper we show how to use failed images and teams, another feature defined in the Fortran 2018 standard, to implement resilient computational applications. (C) 2018 Elsevier B.V. All rights reserved.
机译:随着高性能计算(HPC)中硬件组件和软件堆栈层的数量增加,硬件和软件故障的数量可能会增加,这对于用户是可见的。即使在最乐观的单个组件可靠性假设下,使用数百万个节点进行概率放大也会对整个平台的平均故障间隔时间(MTBF)产生巨大影响。尽管已开发出多种解决此问题的技术,编程模型提供的用于用户减轻或解决此问题的功能仍然不足。 Fortran 2018标准定义了失败的映像,这是一个新功能,允许程序员在并行程序中检测和管理映像故障。本文展示了如何使用失败的映像和团队(Fortran 2018标准中定义的另一个功能)来实现弹性计算应用程序。 (C)2018 Elsevier B.V.保留所有权利。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号