Web Information Systems and Applications Conference

A Distributed Rule Engine for Streaming Big Data


Abstract

Rule engines are widely used in industry and academia because they separate rules from execution logic and incorporate expert knowledge. With the advent of the big data era, the amount of data has grown at an unprecedented rate. However, traditional rule engines that run on PCs or servers struggle to handle streaming big data because of hardware performance limits. Structured stream computing frameworks offer new solutions to these challenges. In this paper, we design a distributed rule engine based on Kafka and Structured Streaming (KSSRE) and propose a rule-fact matching strategy that uses the Spark SQL engine to support a large volume of event-stream inference. KSSRE stores data in DataFrames and inherits the load balancing, scalability, and fault-tolerance mechanisms of Spark 2.x. In addition, to remove possibly repeated rules and optimize the matching process, we represent rules with the ternary grid model [1] and design a scheduling model that improves memory sharing during matching. An evaluation on DBLP data sets shows that KSSRE achieves better performance, scalability, and fault tolerance.
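
The two ideas named in the abstract, merging repeated rules and matching rules against incoming facts, can be illustrated with a minimal single-process sketch. This is not the KSSRE implementation: it uses plain Python rather than Kafka, Structured Streaming, or Spark SQL, and it does not reproduce the ternary grid model. The (attribute, operator, value) condition triples, the rule names, the sample facts, and the helper names `dedupe_rules` and `match` are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical representation: a condition is an (attribute, operator, value)
# triple and a rule is a collection of such conditions (all must hold).
Condition = Tuple[str, str, object]

# Supported comparison operators (an assumption made for this sketch).
OPS = {
    "==": lambda a, b: a == b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
}


def dedupe_rules(
    rules: Dict[str, List[Condition]],
) -> Dict[FrozenSet[Condition], List[str]]:
    """Group rules that have the same condition set, ignoring condition order."""
    merged: Dict[FrozenSet[Condition], List[str]] = defaultdict(list)
    for name, conditions in rules.items():
        merged[frozenset(conditions)].append(name)
    return dict(merged)


def match(fact: dict, merged: Dict[FrozenSet[Condition], List[str]]) -> List[str]:
    """Return the sorted names of all rules whose conditions the fact satisfies.

    Each distinct condition set is evaluated once, even if several rules share it.
    """
    fired: List[str] = []
    for conditions, names in merged.items():
        if all(
            attr in fact and OPS[op](fact[attr], value)
            for attr, op, value in conditions
        ):
            fired.extend(names)
    return sorted(fired)


# Hypothetical rules: r1 and r2 are the same rule written in a different order.
rules = {
    "r1": [("venue", "==", "WISA"), ("year", ">", 2015)],
    "r2": [("year", ">", 2015), ("venue", "==", "WISA")],
    "r3": [("year", "<", 2000)],
}
merged_rules = dedupe_rules(rules)

# Hypothetical facts standing in for events arriving on a stream.
recent_fact = {"venue": "WISA", "year": 2018}
old_fact = {"venue": "VLDB", "year": 1999}

print(match(recent_fact, merged_rules))
print(match(old_fact, merged_rules))
```

In a distributed setting the same matching step would more plausibly be expressed as a join or filter between a rules table and a stream of fact rows, so that the engine's own partitioning and recovery apply; the sketch above only shows the logic on one machine.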
