Journal: IEEE Transactions on Computers

Dynamic On-the-Fly Minimum Cost Benchmarking for Storing Generated Scientific Datasets in the Cloud

Abstract

The massive computation power and storage capacity of cloud computing systems let users either store large generated scientific datasets in the cloud or delete them and regenerate them whenever they are reused. Under the pay-as-you-go model, the more datasets we store, the more storage cost we pay. Alternatively, we can delete some generated datasets to save storage cost, but regeneration then incurs computation cost every time those datasets are reused. Hence, there is a trade-off between computation and storage in the cloud, and different storage strategies lead to different total costs. The minimum cost, which reflects the best trade-off, is an important benchmark for evaluating the cost-effectiveness of different storage strategies. However, the current benchmarking approach is neither efficient enough nor practical enough to be applied on the fly at runtime. In this paper, we propose a novel Partitioned Solution Space based approach with efficient algorithms for dynamic yet practical on-the-fly minimum cost benchmarking of storing generated datasets in the cloud. In this approach, we pre-calculate all the possible minimum cost storage strategies and save them in different partitioned solution spaces. The minimum cost storage strategy represents the minimum cost benchmark. Whenever the datasets’ storage cost changes at runtime in the cloud (e.g., new datasets are generated and/or existing datasets’ usage frequencies change), our algorithms can efficiently retrieve the current minimum cost storage strategy from the partitioned solution space and update the benchmark. Because the benchmark is kept up to date dynamically, our approach can be used on the fly at runtime in the cloud, and the minimum cost benchmark can be either reported proactively or returned instantly upon request. Case studies and experimental results based on the Amazon cloud show the efficiency, scalability and practicality of our approach.
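
The store-or-regenerate trade-off described in the abstract can be illustrated with a small sketch. It is not the paper's Partitioned Solution Space algorithm: it simply enumerates every storage strategy by brute force and picks the cheapest. The linear dependency chain (each dataset is generated from the previous one, and the raw input is always available), the field names `storage_per_day`, `gen_cost` and `uses_per_day`, and all prices are assumptions made up for illustration.

```python
from itertools import product

def cost_rate(datasets, stored):
    """Cost per day of one storage strategy on an assumed linear chain.

    stored[i] is True if dataset i is kept, False if it is deleted and
    regenerated on every use from its nearest stored predecessor.
    """
    total = 0.0
    regen = 0.0  # cost to regenerate the current dataset from the nearest stored predecessor
    for d, keep in zip(datasets, stored):
        if keep:
            total += d["storage_per_day"]
            regen = 0.0  # successors can start from this stored dataset
        else:
            regen += d["gen_cost"]  # deleted predecessors must be rebuilt first
            total += d["uses_per_day"] * regen
    return total

def min_cost_benchmark(datasets):
    """Brute-force search over all 2^n strategies (only feasible for small n)."""
    strategies = product([True, False], repeat=len(datasets))
    best = min(strategies, key=lambda s: cost_rate(datasets, s))
    return best, cost_rate(datasets, best)

# Hypothetical datasets: storage price per day, one-off generation cost,
# and expected uses per day (all values invented for this example).
datasets = [
    {"storage_per_day": 0.10, "gen_cost": 5.0, "uses_per_day": 0.01},
    {"storage_per_day": 0.02, "gen_cost": 2.0, "uses_per_day": 0.5},
    {"storage_per_day": 0.50, "gen_cost": 1.0, "uses_per_day": 0.1},
]

best_strategy, best_cost = min_cost_benchmark(datasets)
print(best_strategy, round(best_cost, 2))
```

With these made-up numbers the cheapest strategy stores only the second dataset: it is cheap to keep and frequently used, while the first is rarely used and the third is expensive to store but cheap to regenerate from its stored predecessor. Exhaustive search grows exponentially with the number of datasets, which is one way to see why an efficient runtime benchmarking approach is needed.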