Surviving Failures in Bandwidth-Constrained Datacenters

Abstract

Datacenter networks have been designed to tolerate failures of network equipment and provide sufficient bandwidth. In practice, however, failures and maintenance of networking and power equipment often make tens to thousands of servers unavailable, and network congestion can increase service latency. Unfortunately, there exists an inherent tradeoff between achieving high fault tolerance and reducing bandwidth usage in network core; spreading servers across fault domains improves fault tolerance, but requires additional bandwidth, while deploying servers together reduces bandwidth usage, but also decreases fault tolerance. We present a detailed analysis of a large-scale Web application and its communication patterns. Based on that, we propose and evaluate a novel optimization framework that achieves both high fault tolerance and significantly reduces bandwidth usage in the network core by exploiting the skewness in the observed communication patterns.
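
The tradeoff described in the abstract can be illustrated with a toy calculation. The sketch below is not the paper's optimization framework. It assumes, for illustration only, that each fault domain is a rack, that any traffic between servers in different racks crosses the network core, and that fault tolerance is measured as the fraction of servers left after the most heavily populated rack fails. The server names, traffic rates, and the helper functions worst_case_survival and core_traffic are all hypothetical.

```python
from collections import Counter


def worst_case_survival(placement):
    """Fraction of servers still available if the fullest fault domain fails.

    placement maps a server name to the fault domain (here: a rack) hosting it.
    """
    per_domain = Counter(placement.values())
    return 1 - max(per_domain.values()) / len(placement)


def core_traffic(placement, traffic):
    """Total traffic between server pairs placed in different fault domains.

    traffic maps an unordered server pair (a, b) to a rate in arbitrary units.
    Assumption: every cross-domain flow traverses the network core.
    """
    return sum(rate for (a, b), rate in traffic.items()
               if placement[a] != placement[b])


# Hypothetical, skewed communication pattern: two pairs exchange most of the
# traffic, and the remaining pairs exchange very little.
traffic = {("s1", "s2"): 10, ("s3", "s4"): 10, ("s1", "s3"): 1, ("s2", "s4"): 1}

# Three candidate placements of the same four servers.
packed = {"s1": "A", "s2": "A", "s3": "A", "s4": "A"}      # all in one rack
spread = {"s1": "A", "s2": "B", "s3": "C", "s4": "D"}      # one server per rack
skew_aware = {"s1": "A", "s2": "A", "s3": "B", "s4": "B"}  # heavy pairs together

for name, placement in [("packed", packed), ("spread", spread),
                        ("skew_aware", skew_aware)]:
    print(name, worst_case_survival(placement), core_traffic(placement, traffic))
```

In this toy setting, packing all servers together uses no core bandwidth but loses everything when the rack fails, while spreading them fully maximizes survival at the highest core traffic. Grouping only the heavily communicating pairs sits between the two, which is the kind of skew the abstract says the framework exploits.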

Bibliographic record

Source
Proceedings of the ACM SIGCOMM 2012 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication | 2012 | pp. 431-442 | 12 pages
Conference location: Helsinki (FI)
Authors
Peter Bodik; Ishai Menache; Mosharaf Chowdhury; Pradeepkumar Mani; David A. Maltz; Ion Stoica
Original format: PDF
Language of text: English
Keywords
datacenter networks; fault tolerance; bandwidth


Similar literature

1. Surviving Failures in Bandwidth-Constrained Datacenters [J]. Peter Bodik, Ishai Menache, Mosharaf Chowdhury, Computer Communication Review. 2012, No. 4
2. Surviving switch failures in cloud datacenters [J]. Rachee Singh, Muqeet Mukhtar, Ashay Krishna, Computer Communication Review. 2021, No. 2
3. Availability-Aware Virtual Cluster Allocation in Bandwidth-Constrained Datacenters [J]. Jialei Liu, Shangguang Wang, Ao Zhou, IEEE Transactions on Services Computing. 2020, No. 3
4. Surviving Failures in Bandwidth-Constrained Datacenters [C]. Peter Bodik, Ishai Menache, Mosharaf Chowdhury, ACM SIGCOMM Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication. 2012
5. Robust and Survivable Network Design Considering Uncertain Node and Link Failures [D]. Sadeghi, Elham. 2016
6. Survivable Deployments of Optical Sensor Networks against Multiple Failures and Disasters: A Survey [O]. Yongjun Zhang, Jingjie Xin. 2019
7. Surviving Failures in Bandwidth-Constrained Datacenters [O]. Peter Bodík, Pradeepkumar Mani, Ishai Menache, 2013
8. Correlated Failures in Survivable Storage Systems [R]. Bakkaloglu, M., Wylie, J. J., Wang, C., 2002
