
Process mining: a two-step approach to balance between underfitting and overfitting



Abstract

Process mining includes the automated discovery of processes from event logs. Based on observed events (e.g., activities being executed or messages being exchanged) a process model is constructed. One of the essential problems in process mining is that one cannot assume to have seen all possible behavior. At best, one has seen a representative subset. Therefore, classical synthesis techniques are not suitable as they aim at finding a model that is able to exactly reproduce the log. Existing process mining techniques try to avoid such "overfitting" by generalizing the model to allow for more behavior. This generalization is often driven by the representation language and very crude assumptions about completeness. As a result, parts of the model are "overfitting" (allow only for what has actually been observed) while other parts may be "underfitting" (allow for much more behavior without strong support for it). None of the existing techniques enables the user to control the balance between "overfitting" and "underfitting". To address this, we propose a two-step approach. First, using a configurable approach, a transition system is constructed. Then, using the "theory of regions", the model is synthesized. The approach has been implemented in the context of ProM and overcomes many of the limitations of traditional approaches.

Bibliographic record

  • Source
    Software and Systems Modeling, 2010, Issue 1, pp. 87-111 (25 pages)
  • Author affiliations

    Eindhoven University of Technology, P.O. Box 513, 5600 MB, Eindhoven, The Netherlands;

    Software Design and Management (sd&m AG), Offenbach am Main, Germany;

    Eindhoven University of Technology, P.O. Box 513, 5600 MB, Eindhoven, The Netherlands;

    Eindhoven University of Technology, P.O. Box 513, 5600 MB, Eindhoven, The Netherlands;

    Technical University of Denmark, Informatics and Mathematical Modelling, Lyngby, Denmark;

    Eindhoven University of Technology, P.O. Box 513, 5600 MB, Eindhoven, The Netherlands;

  • Original format: PDF
  • Language: English
