Solving practical problems in datacenter networks.

Abstract

The soaring demand for always-on, fast-response online services has driven tremendous growth in modern datacenter networks. These networks often rely on scale-out designs with large numbers of commodity switches to reach immense capacity while keeping capital expenses in check. Today, datacenter network operators spend tremendous time and effort on two key challenges: 1) how to efficiently utilize the bandwidth connecting host pairs and 2) how to promptly handle network failures with minimal disruption to the hosted services.

To resolve the first challenge, we propose solutions in both the network layer and the transport layer. In the network layer solution, we advocate designing practical datacenter architectures for easy operation, i.e., an architecture should be reliable, capable of improving bisection bandwidth, scalable, and debugging-friendly. Strictly following these four guidelines, we propose DARD, a Distributed Adaptive Routing architecture for Datacenter networks. DARD allows each end host to reallocate traffic from overloaded paths to underloaded paths without central coordination. We use congestion game theory to show that DARD converges to a Nash equilibrium in finite steps and that its gap to the optimal flow allocation is bounded in the order of 1/log L, with L being the number of links. We use a testbed implementation and simulations to show that DARD can achieve a close-to-optimal flow allocation with small control overhead in practice.

In the transport layer solution, we propose the Explicit Multipath Congestion Control Protocol (MPXCP), which achieves four desirable properties: fast convergence, efficiency, fairness to flows with different RTTs, and negligible queue size. Intensive ns-2 simulations show that MPXCP can quickly converge to efficiency and fairness without building up queues despite different delay-bandwidth products.

To resolve the second challenge, recent research efforts have focused on automatic failure localization. Yet resolving failures still requires significant human intervention, resulting in prolonged failure recovery time. Unlike previous work, we propose NetPilot, a system that aims to quickly mitigate rather than resolve failures. NetPilot mitigates failures in much the same way operators do: by deactivating or restarting suspected offending components. NetPilot circumvents the need to know the exact root cause of a failure by taking an intelligent trial-and-error approach. The core of NetPilot comprises an Impact Estimator that helps guard against overly disruptive mitigation actions and a failure-specific mitigation planner that minimizes the number of trials. We demonstrate that NetPilot can effectively mitigate several types of critical failures commonly encountered in production datacenter networks.
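The trial-and-error mitigation loop that the abstract attributes to NetPilot can be pictured with a toy sketch. This is not the thesis's implementation: the `Action` fields, the `mitigate` function, the `max_impact` threshold, and the device names are all hypothetical, chosen only to show how a planner's ordering and an impact guard might combine.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    """Hypothetical mitigation action, e.g. deactivating or restarting a component."""
    name: str
    suspicion: float         # assumed planner score: higher = more likely the culprit
    estimated_impact: float  # assumed estimator output: fraction of traffic disrupted


def mitigate(actions: List[Action],
             apply: Callable[[Action], None],
             rollback: Callable[[Action], None],
             is_healthy: Callable[[], bool],
             max_impact: float = 0.1) -> Optional[Action]:
    """Try actions from most to least suspicious; return the one that worked, if any."""
    for action in sorted(actions, key=lambda a: a.suspicion, reverse=True):
        if action.estimated_impact > max_impact:
            continue  # impact guard: skip actions judged too disruptive
        apply(action)
        if is_healthy():
            return action  # failure mitigated without knowing the root cause
        rollback(action)   # wrong guess: undo before the next trial
    return None


# Toy scenario (assumed): the fault sits on "link-7", and taking down
# "core-switch-1" would disrupt too much traffic to be attempted.
disabled = set()
tried = []


def apply_action(action: Action) -> None:
    tried.append(action.name)
    disabled.add(action.name)


def rollback_action(action: Action) -> None:
    disabled.discard(action.name)


def network_is_healthy() -> bool:
    return "link-7" in disabled


candidates = [
    Action("core-switch-1", suspicion=0.9, estimated_impact=0.5),
    Action("tor-switch-3", suspicion=0.6, estimated_impact=0.02),
    Action("link-7", suspicion=0.4, estimated_impact=0.01),
]

result = mitigate(candidates, apply_action, rollback_action, network_is_healthy)
print(result.name if result else "no safe action mitigated the failure")
```

In this toy run the most suspicious component is never touched because its estimated impact exceeds the threshold, one wrong guess is rolled back, and the second trial succeeds. Ordering candidates well is what keeps the number of trials small.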

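Bibliographic record

  • Author

    Wu, Xin

  • Author affiliation

    Duke University

  • Degree-granting institution: Duke University
  • Discipline: Computer Science
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 141 p.
  • Total pages: 141
  • Original format: PDF
  • Language of text: English (eng)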
  • Date indexed: 2022-08-17 11:41:29
