Journal: Data Mining and Knowledge Discovery

Flexible constrained sampling with guarantees for pattern mining



Abstract

Pattern sampling has been proposed as a potential solution to the infamous pattern explosion. Instead of enumerating all patterns that satisfy the constraints, individual patterns are sampled proportional to a given quality measure. Several sampling algorithms have been proposed, but each of them has its limitations when it comes to (1) flexibility in terms of quality measures and constraints that can be used, and/or (2) guarantees with respect to sampling accuracy. We therefore present Flexics, the first flexible pattern sampler that supports a broad class of quality measures and constraints, while providing strong guarantees regarding sampling accuracy. To achieve this, we leverage the perspective on pattern mining as a constraint satisfaction problem and build upon the latest advances in sampling solutions in SAT as well as existing pattern mining algorithms. Furthermore, the proposed algorithm is applicable to a variety of pattern languages, which allows us to introduce and tackle the novel task of sampling sets of patterns. We introduce and empirically evaluate two variants of Flexics: (1) a generic variant that addresses the well-known itemset sampling task and the novel pattern set sampling task as well as a wide range of expressive constraints within these tasks, and (2) a specialized variant that exploits existing frequent itemset techniques to achieve substantial speed-ups. Experiments show that Flexics is both accurate and efficient, making it a useful tool for pattern-based data exploration.
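
The sampling task described in the abstract can be illustrated with a small sketch. This is not Flexics: it enumerates every frequent itemset by brute force and then draws patterns with probability proportional to their support, which is exactly the exhaustive enumeration the paper aims to avoid. The toy transactions, the choice of support as the quality measure, and the function names `frequent_itemsets` and `sample_patterns` are all assumptions made for illustration.

```python
import random
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Brute-force enumeration of itemsets with support count >= min_support.

    Illustrative only: this is the exhaustive enumeration that pattern
    sampling is meant to avoid on realistic data.
    """
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support = number of transactions containing every item.
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= min_support:
                result[candidate] = support
    return result


def sample_patterns(patterns, k, rng):
    """Draw k patterns, each with probability proportional to its quality.

    Here the quality measure is assumed to be the support count.
    """
    keys = list(patterns)
    weights = [patterns[p] for p in keys]
    return rng.choices(keys, weights=weights, k=k)


# Hypothetical toy dataset: five transactions over items a, b, c.
transactions = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a"}]
patterns = frequent_itemsets(transactions, min_support=2)
draws = sample_patterns(patterns, k=5, rng=random.Random(0))
```

With these transactions the constraint (support of at least 2) leaves six itemsets with total support 16, so the itemset `("a",)` with support 4 should be drawn about a quarter of the time.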
