Journal: 《计算机应用与软件》 (Computer Applications and Software)

A Scalable Framework for Stream Data Join Processing

Abstract

Join is a key operation in stream query processing. Multiple stream queries are often posed on a single pair of input streams, which gives rise to concurrent join tasks. The result is longer join windows and higher stream input rates, both of which increase the join workload. A generic (purpose-independent) stream processing mechanism that handles many concurrent join tasks efficiently is therefore urgently needed. This paper proposes S2J, a scalable stream join processing framework. S2J adopts a dataflow-oriented processing model, executes each join task by distributing its load across an appropriate number of chained join workers, and uses a tuple-block-based message passing protocol to reduce communication overhead. The framework handles θ-joins efficiently and provides real-time and result-integrity guarantees for join processing. Extensive experiments demonstrate its efficiency and effectiveness.
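The abstract gives no implementation details, so the following is only a minimal single-process sketch of the general idea: a windowed θ-join whose state is spread over a chain of workers, with tuples moved in blocks rather than one at a time. The names `JoinWorker` and `JoinChain`, the `(timestamp, value)` tuple layout, the round-robin placement of blocks, and the time-based window are all assumptions made for illustration; they are not taken from S2J. The sketch also assumes that timestamps within each stream are non-decreasing.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple

Item = Tuple[int, int]    # (timestamp, value); this layout is an assumption
Pair = Tuple[Item, Item]  # result pairs are always ordered as (r, s)


class JoinWorker:
    """One link of the chain: stores a slice of each stream's window."""

    def __init__(self, theta: Callable[[int, int], bool], window: int):
        self.theta = theta
        self.window = window
        self.stored: Dict[str, Deque[Item]] = {"R": deque(), "S": deque()}

    def probe(self, side: str, block: List[Item]) -> List[Pair]:
        """Join an incoming block against the opposite stream's stored slice."""
        store = self.stored["S" if side == "R" else "R"]
        out: List[Pair] = []
        for ts, val in block:
            # Evict stored tuples that no later tuple of this side can match.
            while store and store[0][0] < ts - self.window:
                store.popleft()
            for ots, oval in store:
                if abs(ts - ots) > self.window:
                    continue
                r, s = ((ts, val), (ots, oval)) if side == "R" else ((ots, oval), (ts, val))
                if self.theta(r[1], s[1]):
                    out.append((r, s))
        return out

    def store(self, side: str, block: List[Item]) -> None:
        self.stored[side].extend(block)


class JoinChain:
    """Buffers tuples into blocks and passes each block along the worker chain."""

    def __init__(self, n_workers: int, theta: Callable[[int, int], bool],
                 window: int, block_size: int):
        self.workers = [JoinWorker(theta, window) for _ in range(n_workers)]
        self.block_size = block_size
        self.pending: Dict[str, List[Item]] = {"R": [], "S": []}
        self.next_owner = 0  # round-robin choice of the worker that stores a block
        self.results: List[Pair] = []

    def push(self, side: str, item: Item) -> None:
        self.pending[side].append(item)
        if len(self.pending[side]) >= self.block_size:
            self.flush(side)

    def flush(self, side: str) -> None:
        block, self.pending[side] = self.pending[side], []
        if not block:
            return
        # One "message" per block: every worker probes it, exactly one stores it.
        for worker in self.workers:
            self.results.extend(worker.probe(side, block))
        self.workers[self.next_owner].store(side, block)
        self.next_owner = (self.next_owner + 1) % len(self.workers)

    def flush_all(self) -> None:
        self.flush("R")
        self.flush("S")


if __name__ == "__main__":
    chain = JoinChain(n_workers=3, theta=lambda r, s: r < s, window=5, block_size=2)
    for side, item in [("R", (1, 10)), ("S", (2, 20)), ("R", (3, 30)), ("S", (4, 5))]:
        chain.push(side, item)
    chain.flush_all()
    print(sorted(chain.results))
```

In this sketch every matching pair is produced exactly once, at the moment the later-flushed block probes the worker that stores the earlier one, which is one simple way to keep the result complete. Larger blocks mean fewer passes along the chain at the cost of extra latency before a tuple is joined. A real deployment would run the workers concurrently and would need its own ordering and expiry protocol, which this sequential sketch does not attempt to model.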
