Journal of Parallel and Distributed Computing

Cluster-to-cluster data transfer with data compression over wide-area networks

Abstract

The recent emergence of ultra-high-speed networks of up to 100 Gb/s has posed numerous challenges and has led to many investigations of efficient protocols for saturating 100 Gb/s links. However, end-to-end data transfers involve many components, not only protocols, that affect overall transfer performance. These components include the disk I/O subsystem, additional computation associated with data streams, and network adapters. For example, the bandwidth achievable by TCP may not be realized if disk I/O or the CPU becomes a bottleneck in the end-to-end transfer. In this paper, we first model all the system components involved in end-to-end data transfer as a graph. We then formulate the problem of achieving maximum data transfer throughput using parallel data flows. We also propose a variable data flow GridFTP XIO stack to improve data transfer with data compression. Our contributions lie in how to optimize data transfers considering all the system components involved, rather than in accurately modeling those components. Our proposed formulations and solutions are evaluated through experiments on the ESnet 100G testbed and a wide-area cluster-to-cluster testbed. The experimental results on the ESnet 100G testbed show that our approach is several times faster than Globus Online: 8× faster for datasets with many 10 MB files and 3-4× faster for other datasets with larger files. The experimental results on the cluster-to-cluster testbed show that our variable data flow approach is up to 4× faster than a normal cluster data transfer.
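The abstract does not give the paper's graph model or its formulation, so the following is only an illustrative sketch of the general idea: treat each component (disks, CPU, network adapter) as an edge with a capacity, and compute the best achievable end-to-end throughput as a maximum flow. The node names, the capacities (in Gb/s), and the choice of the Edmonds-Karp algorithm are all assumptions made for this example, not details from the paper, and the sketch does not model data compression.

```python
from collections import deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow. `capacity` maps u -> {v: capacity of edge u->v}."""
    # Build a residual graph that also holds zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, edges in capacity.items():
        for v, cap in edges.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + cap
            residual[v].setdefault(u, 0)

    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left

        # Find the smallest residual capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]

        # Push that much flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


# Hypothetical capacities in Gb/s (assumed values, not measurements).
capacity = {
    "source": {"disk0": 20, "disk1": 20},  # two disks read in parallel
    "disk0": {"cpu": 20},
    "disk1": {"cpu": 20},
    "cpu": {"nic": 30},                    # per-stream computation
    "nic": {"sink": 100},                  # 100 Gb/s network link
}

# The same system restricted to a single disk flow.
single_disk = {
    "source": {"disk0": 20},
    "disk0": {"cpu": 20},
    "cpu": {"nic": 30},
    "nic": {"sink": 100},
}

print(max_flow(capacity, "source", "sink"))     # 30
print(max_flow(single_disk, "source", "sink"))  # 20
```

In this toy setup a second parallel disk flow raises throughput from 20 to 30 Gb/s, after which the CPU edge, not the 100 Gb/s link, is the bottleneck. This matches the abstract's point that protocol bandwidth alone does not determine end-to-end performance.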
