Conference: IEEE International Conference on e-Science

Pilot-Streaming: A Stream Processing Framework for High-Performance Computing


Abstract

An increasing number of scientific applications utilize stream processing to analyze data feeds from scientific instruments, sensors, and simulations. In this paper, we study the streaming and data processing requirements of light source experiments, which are projected to generate data at 20 GB/sec in the near future. As beamtimes available to users are typically short, it is essential that processing and analysis can be conducted in a streaming mode. The development and deployment of streaming applications is a complex task that requires the integration of heterogeneous, distributed infrastructure, frameworks, middleware, and application components written in different languages and abstractions. Streaming applications may be extremely dynamic due to factors such as variable data rates and network congestion, and to application-specific characteristics such as adaptive sampling and differing processing techniques. Consequently, streaming systems are often subject to back-pressure and instabilities, requiring additional infrastructure to mitigate these issues. We propose Pilot-Streaming, a framework for supporting streaming applications and their resource management needs on HPC infrastructure. Underlying Pilot-Streaming is a unifying architecture that decouples important concerns and functions, such as message brokering, transport and communication, and processing. Pilot-Streaming simplifies the deployment of stream processing frameworks, such as Kafka and Spark Streaming, while providing a high-level abstraction for managing streaming infrastructure, e.g., adding or removing resources as required by the application at runtime. This capability is critical for balancing complex streaming pipelines. To address the complexity in the development of streaming applications, we present the Streaming Mini-Apps, which support different pluggable algorithms for data generation and processing, e.g., for reconstructing light source images using different techniques. We u
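The need to add or remove resources at runtime can be illustrated with a small back-of-the-envelope model. The sketch below is not Pilot-Streaming's API: the function names, the per-worker throughput, and the resizing rule are all hypothetical, chosen only to show how a variable ingest rate produces backlog (back-pressure) and how resizing the processing pool drains it.

```python
import math


def workers_needed(ingest_rate_gb_s: float, per_worker_rate_gb_s: float) -> int:
    """Smallest worker count whose combined throughput covers the ingest rate.

    Both rates are in GB/s. The per-worker rate is an assumed figure,
    not a measurement from the paper.
    """
    if per_worker_rate_gb_s <= 0:
        raise ValueError("per-worker rate must be positive")
    return max(1, math.ceil(ingest_rate_gb_s / per_worker_rate_gb_s))


def simulate(ingest_rates, per_worker_rate, start_workers=1):
    """Step through per-second ingest rates, resizing the pool after each step.

    Returns a list of (workers, backlog_gb) tuples, one per step.
    The resizing rule (cover the last observed rate plus the backlog)
    is a hypothetical policy used for illustration only.
    """
    workers = start_workers
    backlog = 0.0
    history = []
    for rate in ingest_rates:
        # Data arrives, then the current pool drains as much as it can.
        backlog = max(0.0, backlog + rate - workers * per_worker_rate)
        history.append((workers, backlog))
        # Resize for the next step based on what was just observed.
        workers = workers_needed(rate + backlog, per_worker_rate)
    return history


if __name__ == "__main__":
    # At the projected 20 GB/s, an assumed 0.5 GB/s per worker needs 40 workers.
    print(workers_needed(20, 0.5))
    # A burst from 2 GB/s to 8 GB/s builds backlog until the pool grows.
    for step in simulate([2, 2, 8, 8, 2], per_worker_rate=1.0, start_workers=2):
        print(step)
```

In this toy model the burst leaves 6 GB queued for one step because the pool was sized for the earlier rate. That kind of imbalance is what runtime resource management is meant to correct.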
