Network and CPU co-allocation in high throughput computing environments.

Abstract

A High Throughput Computing (HTC) environment delivers large amounts of computing capacity to its users over long periods of time by pooling available computing resources on the network. The HTC environment strives to provide useful computing services to its customers while respecting the various resource usage policies set by the many different owners and administrators of the computing resources. This requires a flexible scheduling mechanism to match jobs with compatible computing resources, according to the job's needs and the attributes and policies of the available resources. It also requires mechanisms to help jobs be more agile, so they can successfully compute on the resources currently available to them. A checkpoint or migration facility enables long-running jobs to compute productively on non-dedicated resources. The work the job performs with each allocation is saved in a checkpoint, so the job's state can be transferred to a new execution site where it can continue the computation. A remote data access facility enables jobs to compute on resources that are not co-located with their data. Remote data access might involve transferring the job's data across a local area supercomputer network or a wide area network. These checkpoint and data transfers can generate significant network load.
The HTC environment must manage network resources carefully to use computational resources efficiently while honoring administrative policies. This dissertation explores the network requirements of batch jobs and presents mechanisms for managing network resources to implement administrative policies and improve job goodput. Goodput represents the job's forward progress and can differ from the job's allocated CPU time because of network overheads (when the job blocks on network I/O) and checkpoint rollback (when the job must “roll back” to a previous checkpoint).
The primary contribution of this work is the definition and implementation of a network and CPU co-allocation framework for HTC environments. Making the network an allocated resource enables the system to implement administrative network policies and to improve job goodput via network admission control and scheduling.
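
The relationship between goodput and allocated CPU time described in the abstract can be illustrated with a small sketch. This is not the dissertation's implementation; the `Allocation` record, its field names, and the simple subtraction model are assumptions made here for illustration only. The sketch treats goodput as allocated time minus time blocked on network I/O minus work lost to checkpoint rollback.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Allocation:
    """One CPU allocation granted to a job (hypothetical record, for illustration)."""
    allocated_seconds: float        # time the job held the CPU
    network_blocked_seconds: float  # time spent blocked on network I/O
    rolled_back_seconds: float      # work lost by rolling back to an earlier checkpoint


def goodput_seconds(allocations: Iterable[Allocation]) -> float:
    """Sum the forward progress made across all allocations."""
    total = 0.0
    for a in allocations:
        # Assumed model: forward progress is what remains of the allocation
        # after network overhead and rolled-back work are subtracted.
        total += a.allocated_seconds - a.network_blocked_seconds - a.rolled_back_seconds
    return total


def goodput_fraction(allocations: Iterable[Allocation]) -> float:
    """Fraction of allocated CPU time that turned into forward progress."""
    allocs = list(allocations)
    allocated = sum(a.allocated_seconds for a in allocs)
    if allocated == 0:
        return 0.0
    return goodput_seconds(allocs) / allocated


if __name__ == "__main__":
    history = [
        Allocation(3600.0, 300.0, 0.0),    # checkpointed cleanly, some network waits
        Allocation(1800.0, 120.0, 600.0),  # evicted, lost 10 minutes to rollback
    ]
    print(goodput_seconds(history))
    print(goodput_fraction(history))
```

Under this assumed model, a job whose allocations are mostly consumed by checkpoint transfers or repeated rollbacks shows a low goodput fraction even though its allocated CPU time is high, which is the gap that network admission control and scheduling aim to narrow.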

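Bibliographic record

  • Author: Basney, James Alan
  • Author affiliation: The University of Wisconsin - Madison
  • Degree-granting institution: The University of Wisconsin - Madison
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2001
  • Pages: 114 p.
  • Total pages: 114
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification: Automation technology, computer technology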