
Highly reliable message-passing mechanism for cluster file system


Abstract

As personal computer clusters grow in popularity and number, message passing between nodes has become an important issue because of the high failure rate of the network. A file access in a cluster file system often comprises several sub-operations, each of which includes one or more network transmissions. Any network failure can make the file system service unavailable. In this paper, we describe a highly reliable message-passing mechanism (HR-NET), which tolerates both software and hardware network failures. HR-NET provides fine-grained, connection-level failover across redundant communication paths. With it, the file system can keep passing messages because HR-NET handles failures automatically, either by recovering from the network failure or by failing over to a backup; it thereby shields the cluster file system's requests and data transmissions from network failures. Messages are also load-balanced to relieve network traffic. To handle transmission timeouts, HR-NET uses priority-based message scheduling, which dynamically orders messages so as to tolerate request-response failures between clients and servers. HR-NET is implemented on top of the standard network protocol stack. Performance results show that HR-NET delivers nearly the full bandwidth of the underlying network, with an average throughput loss of 6.17%, and recovers quickly. Experiments with a cluster file system show that the overall performance degradation caused by HR-NET failover is below 8%, while reliability is greatly improved.
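The two ideas named in the abstract, priority-ordered message scheduling and failover across redundant paths, can be illustrated with a small sketch. This is not the HR-NET implementation, which the abstract does not detail. The class name `FailoverSender`, the exception `PathDown`, the path names `eth0` and `eth1`, and the convention that a lower number means higher priority are all assumptions made for illustration.

```python
import heapq
import itertools


class PathDown(Exception):
    """Raised by a path's send function when that path is unusable (hypothetical)."""


class FailoverSender:
    """Illustrative only: priority-ordered sends with failover across paths."""

    def __init__(self, paths):
        # paths: list of (name, send_fn); earlier entries are preferred.
        self.paths = list(paths)
        self.queue = []               # heap of (priority, seq, message)
        self.seq = itertools.count()  # tie-breaker: FIFO within one priority

    def submit(self, message, priority):
        # Assumption: a lower number means a more urgent message.
        heapq.heappush(self.queue, (priority, next(self.seq), message))

    def flush(self):
        """Send every queued message; return a list of (message, path_name)."""
        delivered = []
        while self.queue:
            _, _, message = heapq.heappop(self.queue)
            delivered.append((message, self._send(message)))
        return delivered

    def _send(self, message):
        # Try each path in order and fail over to the next one on error.
        for name, send_fn in self.paths:
            try:
                send_fn(message)
                return name
            except PathDown:
                continue
        raise PathDown("all paths failed for %r" % (message,))


def broken(_message):
    # Simulates a failed primary link.
    raise PathDown("link down")


received = []
sender = FailoverSender([("eth0", broken), ("eth1", received.append)])
sender.submit("read data block", priority=5)
sender.submit("metadata request", priority=1)
result = sender.flush()
print(result)
```

In this sketch the more urgent message is sent first, and both messages end up on the backup path because the primary path always fails. A real mechanism would also need timeouts, path recovery, and load balancing, none of which are modelled here.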

Bibliographic record

  • Source
    Parallel Algorithms and Applications, 2013, Issue 6, pp. 556-575 (20 pages)
  • Author affiliations

    Graduate University of Chinese Academy of Sciences, Beijing, P.R. China, Institute of Computing Technology, Integration Application Center, Chinese Academy of Sciences, Beijing, P.R. China;

    Graduate University of Chinese Academy of Sciences, Beijing, P.R. China, Institute of Computing Technology, Integration Application Center, Chinese Academy of Sciences, Beijing, P.R. China;

    State Key Laboratory of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, P.R. China;

    Institute of Computing Technology, Integration Application Center, Chinese Academy of Sciences, Beijing, P.R. China;

    Institute of Computing Technology, Integration Application Center, Chinese Academy of Sciences, Beijing, P.R. China;

  • Indexing information
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    cluster file system; message-passing mechanism; high reliability; fault tolerance;
