Conference: International Workshop on Grid Computing

Using Disk Throughput Data in Predictions of End-to-End Grid Data Transfers

Abstract

Data grids provide an environment for communities of researchers to share, replicate, and manage access to copies of large datasets. In such environments, fetching data from one of several replica locations requires accurate predictions of end-to-end transfer times. Predicting transfer time is complicated by the several shared components in the end-to-end data path, including networks and disks, each of which experiences load variations that can significantly affect throughput. Of these, disk accesses are rapidly growing in cost and have not been previously considered, although on some machines they can account for up to 30% of the transfer time. In this paper, we present techniques to combine observations of end-to-end application behavior with disk I/O throughput load data. We develop a set of regression models to derive predictions that characterize the effect of disk load variations on file transfer times. We also include network component variations and apply these techniques to transfer logs from the GridFTP server, part of the Globus Toolkit(TM). We observe up to 9% improvement in prediction accuracy when compared with approaches based on past system behavior in isolation.
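
The idea of regressing end-to-end throughput on disk and network load can be illustrated with a small sketch. This is not the paper's model: the linear form, the helper names (`solve_linear_system`, `fit_least_squares`, `predict`), and the synthetic observations are all assumptions made purely for illustration.

```python
from typing import List, Sequence, Tuple


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_least_squares(
    features: Sequence[Tuple[float, ...]], targets: Sequence[float]
) -> List[float]:
    """Return [intercept, coef_1, ...] via the normal equations."""
    rows = [[1.0, *f] for f in features]  # leading 1.0 is the intercept term
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefs: Sequence[float], feature: Sequence[float]) -> float:
    """Evaluate the fitted linear model on one observation."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], feature))


# Hypothetical history: (disk throughput MB/s, network throughput MB/s).
history = [(40.0, 10.0), (35.0, 12.0), (50.0, 8.0),
           (20.0, 11.0), (45.0, 9.0), (30.0, 7.0)]
# Synthetic, noise-free end-to-end throughput (MB/s), assumed for the demo.
observed = [1.0 + 0.1 * disk + 0.5 * net for disk, net in history]

coefs = fit_least_squares(history, observed)
predicted_throughput = predict(coefs, (38.0, 10.0))

# Convert predicted throughput into a transfer-time estimate for one file.
file_size_mb = 1000.0
predicted_seconds = file_size_mb / predicted_throughput
```

With real logs, the `history` and `observed` values would come from measured disk load, network probes, and past transfer records, and the fit would be noisy rather than exact.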
