Achieving higher dependability through host and NIC processor collaboration.

Abstract

Traditionally, distributed systems requiring high dependability were designed using custom hardware with massive amounts of redundancy. In most of these systems, not only the nodes but also the network was replicated. Recently, the need for cost reduction and access to the latest commercial technologies has prompted the use of commercial off-the-shelf (COTS) hardware and software products in the design of such systems. Reliance on COTS technology, however, brings new challenges in system reliability. This dissertation attempts to address these challenges by developing fault tolerance techniques for systems based on modern high-speed networking.

Driven by the demand for greater network performance, emerging network technologies have complex network interfaces with a Network Interface Card (NIC) processor and large local memory. However, this increasing complexity results in a larger set of failure points and a potential increase in the network failure rate. This is in addition to the system failures that can be caused by faults that strike the host system. In this dissertation, we propose to achieve higher dependability of distributed systems through host and NIC processor collaboration. The host processor detects and recovers a failed network interface, and the symbiotic relationship in turn allows the NIC processor to aid in the recovery of a failed host system or application. More specifically, we present an effective, low-overhead, adaptive and concurrent self-testing technique to protect programmable high-speed network interfaces, and a low-overhead message logging protocol to achieve fast recovery from host application crashes.

Bibliographic details

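  • Author

    Zhou, Yizheng

  • Author affiliation

    University of Massachusetts Amherst

  • Degree-granting institution: University of Massachusetts Amherst
  • Subject: Computer Science
  • Degree: Ph.D.
  • Year: 2008
  • Pages: 82 p.
  • Total pages: 82
  • Original format: PDF
  • Language: eng
  • CLC classification: Automation technology, computer technology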