Improving the performance of coordinated checkpointers on networks of workstations using RAID techniques

Abstract

Coordinated checkpointing systems are popular and general-purpose tools for implementing process migration, coarse-grained job swapping, and fault tolerance on networks of workstations. Though simple in concept, there are several design decisions concerning the placement of checkpoint files that can impact the performance and functionality of coordinated checkpointers. Although several such checkpointers have been implemented for popular programming platforms like PVM and MPI, none have taken this issue into consideration. This paper addresses the issue of checkpoint placement and its impact on the performance and functionality of coordinated checkpointing systems. Several strategies, both old and new, are described and implemented on a network of SPARC-5 workstations running PVM. These strategies range from very simple to more complex, the latter borrowing heavily from ideas in RAID (Redundant Arrays of Inexpensive Disks) fault tolerance. The results of this paper will serve as a guide so that future implementations of coordinated checkpointing can allow their users to achieve the combination of performance and functionality that is right for their applications.
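
The abstract does not spell out the individual placement strategies, so the following is only an illustration of the kind of RAID idea it alludes to, not the paper's implementation. It is a minimal sketch of XOR parity as used in parity-based RAID levels: each workstation's checkpoint is treated as a block, a parity block is computed across all of them, and the checkpoint of any single failed workstation can be rebuilt from the survivors plus the parity. The node names, the checkpoint contents, and the helper `xor_blocks` are hypothetical, and the checkpoints are assumed to have equal length.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("blocks must be non-empty and of equal length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Hypothetical checkpoints, one per workstation (assumed equal length).
checkpoints = {
    "node0": b"state-A0",
    "node1": b"state-B1",
    "node2": b"state-C2",
}

# The parity block would be stored on a separate workstation.
parity = xor_blocks(list(checkpoints.values()))

# Simulate losing one workstation, then rebuild its checkpoint
# from the surviving checkpoints plus the parity block.
lost = "node1"
survivors = [data for name, data in checkpoints.items() if name != lost]
recovered = xor_blocks(survivors + [parity])
```

In this sketch only one extra block is stored for the whole group, at the cost of tolerating just a single failure; that is the kind of trade-off between performance and functionality the abstract refers to.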