Enabling dynamic interactions in large scale applications and scientific workflows using semantically specialized shared dataspaces.

Abstract

Emerging scientific and engineering applications use large-scale parallel machines to simulate, with higher accuracy, complex physical phenomena consisting of dynamically interacting processes. The workflows associated with these applications consist of parallel application codes that need to coordinate and interact at runtime. The interactions typically involve large volumes of data that must be exchanged and processed by the codes. The heterogeneous nature of the coupled codes, their numerical formulations, and their data decompositions lead to complex and dynamic interaction and data exchange patterns that are only defined at runtime. Moreover, these simulations often run on separate resources and progress at different rates, which adds to their complexity.

Efficient and scalable implementation of these coupled application workflows presents several challenging programming, orchestration, coordination, and data exchange requirements. Existing programming frameworks, however, are rigid and provide limited support for the dynamic interactions manifested by these applications. For example, existing frameworks need to gather global application knowledge, impose tight synchronization between applications, or demand pre-defined and static interaction patterns that must be known prior to execution. These constraints can introduce significant performance penalties and can limit the expressiveness of application interaction programming.

This thesis explores a new communication and coordination model to enable flexible and asynchronous coupling in coupled application workflows. The model derives from the tuple-space model and provides the abstraction of a virtual distributed shared space, which is customized for the application data domain. It enables applications to coordinate and exchange data by inserting and retrieving data objects, and it does not impose any synchronization requirements between independent applications. Data stored in the space can be accessed by multiple applications, which can associatively query the space and retrieve data objects. Furthermore, the model enables decoupled and dynamic interactions driven by application computations.

This thesis presents DataSpaces, a prototype implementation of the distributed shared-space model. DataSpaces enables memory-to-memory application coupling and transparent data redistribution. It can complement existing workflow engines to enable in-memory data transports between distributed applications that run on separate resources as part of end-to-end scientific workflows. The thesis also presents ActiveSpaces, which extends DataSpaces and the shared-space model to enable in-transit data processing. It proposes and demonstrates a shift in the data processing paradigm by moving processing code closer to the data. ActiveSpaces provides programming support for defining data processing routines, and a runtime execution system to deploy and remotely execute these routines on the space. The research concepts and software frameworks have been deployed and evaluated using real application workflows in production runs on high-end computing systems.
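
The shared-space interaction pattern described in the abstract can be illustrated with a small, single-process Python sketch. It is not the DataSpaces or ActiveSpaces API: the names `SharedSpace`, `DataObject`, `put`, `get`, and `run`, the 2D integer grid, and the `temperature` variable are all assumptions made for illustration. The sketch shows two producer ranks inserting differently decomposed pieces of a field, a consumer retrieving a region that spans both pieces through an associative query (name, version, bounding box), and a routine being executed "on the space" so that only a small result is returned.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Cell = Tuple[int, int]
Box = Tuple[Cell, Cell]  # ((x_lo, y_lo), (x_hi, y_hi)), bounds inclusive


@dataclass(frozen=True)
class DataObject:
    """One piece of a named, versioned field covering a rectangular region."""
    name: str
    version: int
    box: Box
    values: Dict[Cell, float]


def overlaps(a: Box, b: Box) -> bool:
    """True if the two inclusive boxes share at least one cell."""
    (alo, ahi), (blo, bhi) = a, b
    return all(alo[d] <= bhi[d] and blo[d] <= ahi[d] for d in range(2))


def inside(cell: Cell, box: Box) -> bool:
    """True if the cell lies within the inclusive box."""
    lo, hi = box
    return all(lo[d] <= cell[d] <= hi[d] for d in range(2))


class SharedSpace:
    """Hypothetical in-memory stand-in for a distributed shared space."""

    def __init__(self) -> None:
        self._objects: List[DataObject] = []

    def put(self, obj: DataObject) -> None:
        # Producers insert objects without knowing who will read them.
        self._objects.append(obj)

    def get(self, name: str, version: int, box: Box) -> Dict[Cell, float]:
        # Associative query: match on name, version, and region overlap,
        # then assemble the requested region from every matching piece.
        region: Dict[Cell, float] = {}
        for obj in self._objects:
            if obj.name == name and obj.version == version and overlaps(obj.box, box):
                for cell, value in obj.values.items():
                    if inside(cell, box):
                        region[cell] = value
        return region

    def run(self, name: str, version: int, box: Box,
            routine: Callable[[Dict[Cell, float]], float]) -> float:
        # Apply the routine where the data lives and return only its result.
        return routine(self.get(name, version, box))


def make_piece(box: Box) -> Dict[Cell, float]:
    """Fill a box with the toy field value x + y."""
    (x_lo, y_lo), (x_hi, y_hi) = box
    return {(x, y): float(x + y)
            for x in range(x_lo, x_hi + 1)
            for y in range(y_lo, y_hi + 1)}


space = SharedSpace()

# Two producer ranks each own half of a 4 x 2 grid.
left_box: Box = ((0, 0), (1, 1))
right_box: Box = ((2, 0), (3, 1))
space.put(DataObject("temperature", 1, left_box, make_piece(left_box)))
space.put(DataObject("temperature", 1, right_box, make_piece(right_box)))

# The consumer asks for a region that straddles both producers' pieces.
query_box: Box = ((1, 0), (2, 1))
coupled = space.get("temperature", 1, query_box)

# In-space processing: compute the maximum without fetching the region first.
hottest = space.run("temperature", 1, query_box, lambda region: max(region.values()))
```

In this sketch the consumer never learns how the producers decomposed the field; it names only the region it needs, which is the decoupling the abstract attributes to the shared-space model. A real implementation would distribute the stored objects across nodes and move data memory-to-memory, which this toy version does not attempt.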

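Bibliographic record

  • Author

    Docan, Ciprian

  • Author affiliation

    Rutgers The State University of New Jersey - New Brunswick

  • Degree-granting institution: Rutgers The State University of New Jersey - New Brunswick
  • Discipline: Engineering Computer
  • Degree: Ph.D.
  • Year: 2011
  • Pages: 140 p.
  • Total pages: 140
  • Original format: PDF
  • Language of text: eng
  • Chinese Library Classification:
  • Keywords:

  • Date added: 2022-08-17 11:44:19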
