Surviving Failures in Bandwidth-Constrained Datacenters

Abstract

Datacenter networks have been designed to tolerate failures of network equipment and to provide sufficient bandwidth. In practice, however, failures and maintenance of networking and power equipment often make tens to thousands of servers unavailable, and network congestion can increase service latency. Unfortunately, there is an inherent tradeoff between achieving high fault tolerance and reducing bandwidth usage in the network core: spreading servers across fault domains improves fault tolerance but requires additional bandwidth, while deploying servers together reduces bandwidth usage but also decreases fault tolerance. We present a detailed analysis of a large-scale Web application and its communication patterns. Based on that analysis, we propose and evaluate a novel optimization framework that both achieves high fault tolerance and significantly reduces bandwidth usage in the network core by exploiting the skewness in the observed communication patterns.
