Job Management Requirements for NAS Parallel Systems and Clusters
Abstract
A job management system is a critical component of a production supercomputing environment, permitting oversubscribed resources to be shared fairly and efficiently. Job management systems that were originally designed for traditional vector supercomputers are not appropriate for the distributed-memory parallel supercomputers that are becoming increasingly important in the high-performance computing industry. Newer job management systems offer new functionality but do not solve fundamental problems. We address some of the main issues in resource allocation and job scheduling that we have encountered on two parallel computers: a 160-node IBM SP2 and a cluster of 20 high-performance workstations located at the Numerical Aerodynamic Simulation facility. We describe the requirements for resource allocation and job management that are necessary to provide a production supercomputing environment on these machines, prioritizing them according to difficulty and importance, and advocating a return to fundamental issues.