Journal: Distributed and Parallel Databases

Enabling efficient process mining on large data sets: realizing an in-database process mining operator


Abstract

Process mining can be used to analyze business processes based on logs of their execution. These execution logs are often obtained by querying a database and storing the results in a file. The mining itself is then done on the file, so the data processing power of the database cannot be used after the log is extracted. Enabling process mining directly on a database therefore provides additional flexibility and efficiency. To facilitate this, this paper formally defines a database operator that extracts the 'directly follows' relation (one of the relations at the heart of process mining) from an operational database. It defines the operator using the well-known relational algebra and formally proves equivalence properties of the operator that are useful for query optimization. Subsequently, it presents time-complexity properties of the operator. Finally, it presents an implementation of the operator as part of the H2 DBMS and demonstrates that this implementation extracts the 'directly follows' relation from a database with an arbitrary database structure within a fraction of a second, which is several orders of magnitude faster than is currently possible.
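To illustrate what the 'directly follows' relation contains, the following minimal sketch computes it with a plain SQL self-join on an in-memory SQLite database. It is neither the operator defined in the paper nor its H2 implementation. The table `event_log`, its columns `case_id`, `activity`, and `ts`, and the helper names are assumptions made for this example, and timestamps are assumed to be unique within a case.

```python
import sqlite3


def build_demo_log():
    """Create an in-memory database with a tiny, made-up event log."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE event_log (case_id INTEGER, activity TEXT, ts INTEGER)")
    events = [
        (1, "register", 1), (1, "check", 2), (1, "pay", 3),
        (2, "register", 1), (2, "pay", 2),
        (3, "register", 1), (3, "check", 2), (3, "pay", 3),
    ]
    conn.executemany("INSERT INTO event_log VALUES (?, ?, ?)", events)
    return conn


def directly_follows(conn):
    """Return (activity_a, activity_b, count) for each pair where b directly follows a.

    Event b directly follows event a if both belong to the same case,
    a happens before b, and no other event of that case lies between them.
    """
    query = """
        SELECT a.activity, b.activity, COUNT(*)
        FROM event_log AS a
        JOIN event_log AS b
          ON a.case_id = b.case_id AND a.ts < b.ts
        WHERE NOT EXISTS (
            SELECT 1
            FROM event_log AS c
            WHERE c.case_id = a.case_id
              AND c.ts > a.ts
              AND c.ts < b.ts
        )
        GROUP BY a.activity, b.activity
        ORDER BY a.activity, b.activity
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    for source, target, count in directly_follows(build_demo_log()):
        print(f"{source} -> {target}: {count}")
```

Because the whole computation is a single query, the database can filter or join the event data before the relation is extracted, which is the kind of flexibility the abstract argues for. A naive self-join like this one is only meant to show the semantics of the relation, not to be efficient.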
