Published in: IEEE International Congress on Big Data

Managing hot metadata for scientific workflows on multisite clouds

Abstract

Large-scale scientific applications are often expressed as workflows that help define the data dependencies between their components. Several such workflows have huge storage and computation requirements, so they need to be processed in multiple (cloud-federated) datacenters. It has been shown that efficient metadata handling plays a key role in the performance of computing systems. However, most of this evidence to date concerns only single-site HPC systems. In this paper, we present a hybrid decentralized/distributed model for handling hot metadata (frequently accessed metadata) in multisite architectures. We couple our model with a scientific workflow management system (SWfMS) to validate and tune its applicability to different real-life scientific scenarios. We show that efficient management of hot metadata improves the performance of the SWfMS, reducing workflow execution time by up to 50% for highly parallel jobs and avoiding unnecessary cold metadata operations.
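The abstract does not spell out how hot and cold metadata are told apart or propagated, so the following is only an illustrative sketch of the general idea, not the authors' model. It assumes that a metadata key becomes "hot" once its access count reaches a threshold, that hot writes are pushed to all sites immediately, and that cold writes are deferred and sent in a batch. The class name `MultisiteMetadataStore`, the `hot_threshold` parameter, and the example keys are all hypothetical.

```python
from collections import defaultdict


class MultisiteMetadataStore:
    """Toy model with one in-memory metadata dict per site (hypothetical design)."""

    def __init__(self, sites, hot_threshold=3):
        self.sites = {name: {} for name in sites}
        self.hot_threshold = hot_threshold    # assumed cutoff, not taken from the paper
        self.access_counts = defaultdict(int)
        self.pending_cold = {}                # key -> (origin site, latest value)
        self.remote_ops = 0                   # number of cross-site writes performed

    def is_hot(self, key):
        # Assumption: "frequently accessed" means the count reached the threshold.
        return self.access_counts[key] >= self.hot_threshold

    def read(self, site, key):
        self.access_counts[key] += 1
        return self.sites[site].get(key)

    def _propagate(self, origin, key, value):
        # Copy the value to every site other than the one that wrote it.
        for other in self.sites:
            if other != origin:
                self.sites[other][key] = value
                self.remote_ops += 1

    def write(self, site, key, value):
        self.access_counts[key] += 1
        self.sites[site][key] = value
        if self.is_hot(key):
            # Hot metadata: propagate eagerly and drop any stale deferred update.
            self.pending_cold.pop(key, None)
            self._propagate(site, key, value)
        else:
            # Cold metadata: defer; only the latest value per key is kept.
            self.pending_cold[key] = (site, value)

    def flush_cold(self):
        # Send all deferred cold updates in one batch, e.g. at the end of a job.
        for key, (origin, value) in self.pending_cold.items():
            self._propagate(origin, key, value)
        self.pending_cold.clear()
```

In this toy setup, a key such as a task status that is read and written repeatedly soon crosses the threshold and is then kept consistent across sites on every write, while a rarely touched key such as a file checksum costs no cross-site operation until `flush_cold()` is called.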
