Journal: Journal of Supercomputing

Dispatching stream operators in parallel execution of continuous queries

Abstract

A data stream is a continuous, rapid, time-varying sequence of data elements that must be processed online. Such streams are studied in the context of Data Stream Management Systems (DSMSs). Single-processor DSMSs cannot adequately satisfy the requirements of data stream applications; the main shortcomings concern tuple latency, tuple loss, and throughput. In our previous publications, we introduced parallel execution of continuous queries to overcome these problems by improving performance, especially in terms of tuple latency. There, operators were scheduled in an event-driven manner, which degraded system performance in the periods between consecutive scheduling instances. In this paper, we present a continuous scheduling method (dispatching) that better matches the continuous nature of data streams and queries, in order to improve system adaptivity and performance. In a multiprocessing environment, the dispatching method makes each processing node (logical machine) send its partially processed tuples to the next machine with the minimum workload, which then executes the next operator on them. Operator scheduling is thus performed continuously and dynamically for each tuple processed by each operator. The dispatching method is described and formally presented, and its correctness is proved. It is also modeled with Petri nets and evaluated via simulation. The results show that the dispatching method significantly improves system performance in terms of tuple latency, throughput, and tuple loss. Furthermore, the fluctuation of system performance parameters (under variation of system and stream characteristics) diminishes considerably, which leads to high adaptivity to the underlying system.
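
The per-tuple routing idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm or formal model; it only shows the general pattern of choosing, before each operator, the machine with the minimum current workload. The names `Machine`, `Operator`, and `dispatch`, the use of an accumulated cost counter as the workload metric, and the first-in-list tie-breaking rule are all assumptions made for illustration, since the abstract does not specify them.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple


@dataclass
class Machine:
    """Hypothetical logical machine; workload is an assumed metric."""
    name: str
    workload: int = 0  # accumulated cost of the work assigned so far


@dataclass
class Operator:
    """Hypothetical query operator with an assumed fixed cost per tuple."""
    name: str
    fn: Callable[[Any], Any]
    cost: int = 1


def dispatch(value: Any, operators: List[Operator],
             machines: List[Machine]) -> Tuple[Any, List[str]]:
    """Push one tuple through the operator chain.

    Before each operator, the machine with the minimum workload is
    selected (ties go to the earliest machine in the list), so the
    scheduling decision is made per tuple and per operator.
    """
    route = []
    for op in operators:
        target = min(machines, key=lambda m: m.workload)
        target.workload += op.cost  # assumed: workload grows by operator cost
        value = op.fn(value)        # the chosen machine runs the next operator
        route.append(target.name)
    return value, route


if __name__ == "__main__":
    machines = [Machine("A"), Machine("B"), Machine("C")]
    operators = [
        Operator("scale", lambda x: x * 2, cost=1),
        Operator("offset", lambda x: x + 1, cost=3),
        Operator("square", lambda x: x * x, cost=2),
    ]
    for item in (3, 1):
        result, route = dispatch(item, operators, machines)
        print(item, "->", result, "via", route)
```

In this sketch the workload never decreases, because completed work is not subtracted. A simulation closer to the setting described in the abstract would presumably track pending work per machine over time, which is left out here to keep the example short.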