Journal of Parallel and Distributed Computing
A data replication algorithm for groups of files in data grids

Abstract

Data grids are emerging as the main part of the infrastructure for large-scale data-intensive applications such as high-energy physics and bioinformatics. The deployment of such infrastructures has allowed users of a grid site to access large amounts of distributed data. Data replication is a key issue in a data grid and can be applied intelligently, because it reduces data access time and bandwidth consumption for each grid site. In this paper, we introduce a new dynamic data replication algorithm named Popular Groups of Files Replication (PGFR). The algorithm rests on one assumption: users in a Virtual Organization have similar interests in groups of files. Based on this assumption and on file access history, PGFR builds a connectivity graph to recognize groups of dependent files in each grid site and replicates the most popular groups of files to each grid site, thus increasing local availability. We used the OptorSim simulator to evaluate the efficiency of PGFR. The simulation results show that PGFR performs better than the existing algorithm: it minimizes mean job execution time and bandwidth consumption and avoids unnecessary replication.

Highlights

  • PGFR considers dependencies between files (data) when replicating data.
  • PGFR replicates a group of dependent files to the requesting grid site.
  • PGFR reduces mean job execution time and bandwidth consumption and avoids unnecessary replication.
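
The abstract describes PGFR only in outline: a connectivity graph is built from file access history, groups of dependent files are recognized, and the most popular groups are replicated. The Python sketch below illustrates that general idea and should not be read as the authors' implementation. Everything specific in it is an assumption made for illustration: the function name `popular_groups`, the `min_co_access` threshold, treating each connected component as one group, and scoring a group by the total accesses of its files.

```python
from collections import Counter, defaultdict
from itertools import combinations


def popular_groups(access_history, min_co_access=2):
    """Return groups of co-accessed files, most popular first.

    access_history: list of jobs, each job a list of file names it accessed.
    min_co_access: hypothetical threshold; two files are linked only if at
    least this many jobs accessed both of them.
    """
    file_hits = Counter()  # number of jobs that accessed each file
    pair_hits = Counter()  # number of jobs that accessed both files of a pair
    for job_files in access_history:
        unique = sorted(set(job_files))
        file_hits.update(unique)
        pair_hits.update(combinations(unique, 2))

    # Connectivity graph: keep only the edges that reach the threshold.
    graph = defaultdict(set)
    for (a, b), hits in pair_hits.items():
        if hits >= min_co_access:
            graph[a].add(b)
            graph[b].add(a)

    # Assumption: each connected component is one group of dependent files.
    groups, seen = [], set()
    for start in sorted(graph):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        groups.append(sorted(component))

    # Assumption: a group's popularity is the total accesses of its files.
    groups.sort(key=lambda g: sum(file_hits[f] for f in g), reverse=True)
    return groups


# Made-up access history for one grid site (not data from the paper).
history = [
    ["f1", "f2", "f3"],
    ["f1", "f2"],
    ["f2", "f3"],
    ["f4", "f5"],
    ["f4", "f5"],
    ["f6"],
]
ranked = popular_groups(history)
print(ranked)
```

In a replication policy along these lines, a site that requests one file of the top-ranked group would receive the whole group, so later jobs find the related files locally. Files that never reach the co-access threshold with any other file, such as `f6` here, belong to no group.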