Conference: International Conference on Parallel and Distributed Computing, Applications and Technologies

New Replication Strategy Based on Maximal Frequent Correlated Pattern Mining for Data Grids


Abstract

Data replication in data grids is an efficient technique that aims to improve response time, reduce bandwidth consumption and maintain reliability. Much work has been done in this context and many strategies have been proposed. Unfortunately, most existing replication techniques are based on single-file granularity and neglect correlations among different data files. File correlations are becoming an increasingly important consideration for performance enhancement in data grids. Indeed, the analysis of real data-intensive grid applications reveals that jobs request groups of correlated files, and suggests that these correlations can be exploited to improve the effectiveness of replication strategies. In this paper, we propose a new dynamic, periodic, decentralized data replication strategy, called RSBMFCP, which uses a set of correlated files as its granularity. Our strategy groups files according to how often jobs access them together, and stores correlated files at the same site. To discover these correlations, we introduce a maximal frequent correlated pattern mining algorithm from the data mining field, with all-confidence as the correlation measure. The proposed strategy consists of four steps: storing the file access history, converting the file access history into a logical history file, applying the maximal frequent correlated pattern mining algorithm, and performing replication and replacement. Experiments using the well-known data grid simulator OptorSim show that the proposed strategy performs better than other strategies in terms of job execution time and effective network usage.
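The mining steps described in the abstract can be illustrated with a small Python sketch. It is not the authors' algorithm: the abstract gives no implementation details, so this is a brute-force enumeration over a toy access history. It assumes the "logical history file" is a boolean job-by-file matrix and uses the standard definition of all-confidence (the support of a file set divided by the largest support of any single file in it). All function names, thresholds and the sample data are hypothetical.

```python
from itertools import combinations


def to_logical_history(access_history, files):
    """Assumed form of the 'logical history file': one boolean row per job,
    one column per file, True when the job accessed that file."""
    rows = []
    for job in access_history:
        accessed = set(job)
        rows.append([f in accessed for f in files])
    return rows


def count(rows, cols):
    """Number of jobs that accessed every file in `cols`."""
    return sum(all(row[c] for c in cols) for row in rows)


def support(rows, cols):
    """Fraction of jobs that accessed every file in `cols`."""
    return count(rows, cols) / len(rows) if rows else 0.0


def all_confidence(rows, cols):
    """Support of the set divided by the largest single-file support
    (computed on raw counts, which gives the same ratio)."""
    max_single = max(count(rows, [c]) for c in cols)
    return count(rows, cols) / max_single if max_single else 0.0


def maximal_frequent_correlated(rows, files, min_supp, min_allconf):
    """Brute force: keep file sets of size >= 2 that meet both thresholds,
    then drop any set that is a proper subset of another kept set."""
    n = len(files)
    kept = []
    for size in range(2, n + 1):
        for cols in combinations(range(n), size):
            if (support(rows, cols) >= min_supp
                    and all_confidence(rows, cols) >= min_allconf):
                kept.append(frozenset(cols))
    maximal = [p for p in kept if not any(p < q for q in kept)]
    return [sorted(files[c] for c in p) for p in maximal]


# Hypothetical access history: each inner list is the files one job read.
FILES = ["A", "B", "C", "D"]
HISTORY = [
    ["A", "B", "C"],
    ["A", "B", "C"],
    ["A", "B"],
    ["C", "D"],
    ["A", "B", "C"],
    ["D"],
]

if __name__ == "__main__":
    rows = to_logical_history(HISTORY, FILES)
    # Each pattern returned is a group of files a replication step
    # could place together at the same site.
    print(maximal_frequent_correlated(rows, FILES, 0.5, 0.7))
```

Exhaustive enumeration is exponential in the number of files, so it only suits toy inputs; a practical miner would prune the search using the thresholds.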
