EPJ Web of Conferences

Migrating Engineering Windows HPC applications to Linux HTCondor and Slurm Clusters



Abstract

The CERN IT department has been maintaining different High Performance Computing (HPC) services over the past five years. While the bulk of computing facilities at CERN run under Linux, a Windows cluster was dedicated to engineering simulations and analysis related to accelerator technology development. The Windows cluster consisted of machines with powerful CPUs, large memory, and a low-latency interconnect. The Linux cluster resources are accessible through HTCondor and are used for general-purpose jobs that are parallel but confined to a single node, providing computing power to the CERN experiments and departments for tasks such as physics event reconstruction, data analysis, and simulation. For HPC workloads that require multi-node parallel environments for Message Passing Interface (MPI) based programs, there is another Linux-based HPC service that comprises several clusters running under the Slurm batch system and consists of powerful hardware with low-latency interconnects.

In 2018, it was decided to consolidate compute-intensive jobs on Linux to make better use of the existing resources. This was also in line with the CERN IT strategy to reduce its dependencies on Microsoft products. This paper focuses on the migration of Ansys [1], COMSOL [2] and CST [3] users from Windows HPC to Linux clusters. Ansys, COMSOL and CST are three engineering applications used at CERN in different domains, such as multiphysics simulations and electromagnetic field problems. Users of these applications are in different departments, with different needs and levels of expertise. In most cases, the users have no prior knowledge of Linux. The paper presents the technical strategy that allows the engineering users to submit their simulations to the appropriate Linux cluster, depending on their simulation requirements. We also describe the technical solution for integrating their Windows workstations so that they can submit to the Linux clusters from them. Finally, we discuss the challenges and lessons learnt during the migration.
