Conference: 19th International Symposium on High Performance Distributed Computing 2010

File-Access Patterns of Data-Intensive Workflow Applications and their Implications to Distributed Filesystems

Abstract

This paper studies five real-world data-intensive workflow applications in the fields of natural language processing, astronomy image analysis, and web data analysis. Data-intensive workflows are becoming increasingly important applications for cluster and Grid environments. They pose new challenges to various components of workflow execution environments, including job dispatchers, schedulers, file systems, and file staging tools. The keys to achieving high performance are efficient data sharing among executing hosts and locality-aware scheduling that reduces the amount of data transfer. While much work has been done on scheduling workflows, many of these studies use synthetic or random workloads. As such, their impact on real workloads is largely unknown. Understanding the characteristics of real-world workflow applications is a required step to promote research in this area. To this end, we analyse real-world workflow applications, focusing on their file access patterns, and summarize their implications for schedulers and file system/staging designs.
