Sampling-based sequential subgroup mining

Abstract

Subgroup discovery is a learning task that aims at finding interesting rules from classified examples. The search is guided by a utility function that trades off the coverage of rules against their statistical unusualness. One shortcoming of existing approaches is that they do not incorporate prior knowledge. To this end, a novel generic sampling strategy is proposed. It allows pattern mining to be turned into an iterative process. In each iteration, the focus of subgroup discovery lies on those patterns that are unexpected with respect to prior knowledge and previously discovered patterns. The result of this technique is a small, diverse set of understandable rules that characterise a specified property of interest. As another contribution, this article derives a simple connection between subgroup discovery and classifier induction. For a popular utility function, this connection allows any standard rule induction algorithm to be applied to the task of subgroup discovery after a step of stratified resampling. The proposed techniques are empirically compared to state-of-the-art subgroup discovery algorithms.
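
The trade-off between coverage and statistical unusualness can be illustrated with weighted relative accuracy (WRAcc), a utility function commonly used in subgroup discovery. The abstract does not name its utility function, so the choice of WRAcc is an assumption, as are the function name `wracc` and the toy data below. This is a minimal sketch, not the article's method.

```python
def wracc(covered, positive):
    """Weighted relative accuracy of one rule on a labelled sample.

    covered[i]  -- True if the rule's condition matches example i
    positive[i] -- True if example i has the property of interest

    WRAcc = coverage * (precision - prior), so a rule scores highly
    only if it covers many examples AND its share of positives
    deviates from the share in the whole sample.
    """
    n = len(covered)
    if n == 0 or n != len(positive):
        raise ValueError("inputs must be non-empty and of equal length")
    n_covered = sum(covered)
    if n_covered == 0:
        return 0.0  # a rule that matches nothing carries no information
    prior = sum(positive) / n
    coverage = n_covered / n
    precision = sum(1 for c, p in zip(covered, positive) if c and p) / n_covered
    return coverage * (precision - prior)


# Hypothetical toy sample: 8 examples, 3 of them positive.
positive = [True, True, False, True, False, False, False, False]
# A rule covering the first three examples (2 of 3 are positive).
rule = [True, True, True, False, False, False, False, False]
# A trivial rule covering every example.
trivial = [True] * 8

score_rule = wracc(rule, positive)
score_trivial = wracc(trivial, positive)
```

Here the trivial rule scores exactly zero because its precision equals the prior, whereas the narrower rule scores above zero because its precision (2/3) exceeds the prior (3/8).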