Journal: Data Mining and Knowledge Discovery

Exceptional Model Mining: Supervised descriptive local pattern mining with complex target concepts



Abstract

Finding subsets of a dataset that somehow deviate from the norm, i.e. where something interesting is going on, is a classical Data Mining task. In traditional local pattern mining methods, such deviations are measured in terms of a relatively high occurrence (frequent itemset mining), or an unusual distribution for one designated target attribute (common use of subgroup discovery). These, however, do not encompass all forms of "interesting". To capture a more general notion of interestingness in subsets of a dataset, we develop Exceptional Model Mining (EMM). This is a supervised local pattern mining framework, where several target attributes are selected, and a model over these targets is chosen to be the target concept. Then, we strive to find subgroups: subsets of the dataset that can be described by a few conditions on single attributes. Such subgroups are deemed interesting when the model over the targets on the subgroup is substantially different from the model on the whole dataset. For instance, we can find subgroups where two target attributes have an unusual correlation, a classifier has a deviating predictive performance, or a Bayesian network fitted on several target attributes has an exceptional structure. We give an algorithmic solution for the EMM framework, and analyze its computational complexity. We also discuss some illustrative applications of EMM instances, including using the Bayesian network model to identify meteorological conditions under which food chains are displaced, and using a regression model to find the subset of households in the Chinese province of Hunan that do not follow the general economic law of demand.
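
To make the framework concrete, here is a minimal Python sketch of one EMM instance mentioned in the abstract: finding subgroups in which two target attributes have an unusual correlation. Everything specific in it is an assumption for illustration only: the toy data, the function names `pearson` and `find_exceptional_subgroups`, the restriction to `attribute == value` conditions, the exhaustive enumeration of descriptions, and the quality measure (the absolute difference between the subgroup correlation and the correlation on the whole dataset). It is not the algorithm or the quality measure given in the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def find_exceptional_subgroups(rows, descriptors, target_x, target_y,
                               max_conditions=2, min_size=3):
    """Rank subgroups by how far their correlation deviates from the global one.

    A subgroup is described by up to `max_conditions` conditions of the form
    attribute == value, each on a different descriptor attribute.
    """
    global_r = pearson([r[target_x] for r in rows], [r[target_y] for r in rows])
    conditions = sorted({(a, r[a]) for r in rows for a in descriptors})
    ranked = []
    for k in range(1, max_conditions + 1):
        for description in combinations(conditions, k):
            # Skip descriptions that constrain the same attribute twice.
            if len({a for a, _ in description}) < k:
                continue
            cover = [r for r in rows if all(r[a] == v for a, v in description)]
            if len(cover) < min_size:
                continue
            sub_r = pearson([r[target_x] for r in cover],
                            [r[target_y] for r in cover])
            quality = abs(sub_r - global_r)  # assumed quality measure
            ranked.append((quality, description, len(cover), sub_r))
    ranked.sort(key=lambda item: item[0], reverse=True)
    return global_r, ranked


# Hypothetical toy data: demand falls with price in the north, rises in the south.
rows = (
    [{"region": "north", "type": "ab"[i % 2], "price": p, "demand": 12 - 2 * p}
     for i, p in enumerate(range(1, 7))]
    + [{"region": "south", "type": "ab"[i % 2], "price": p, "demand": p}
       for i, p in enumerate(range(1, 5))]
)

global_r, ranked = find_exceptional_subgroups(
    rows, descriptors=["region", "type"], target_x="price", target_y="demand")

print(f"global correlation: {global_r:.3f}")
for quality, description, size, sub_r in ranked[:3]:
    print(description, f"size={size} r={sub_r:.3f} quality={quality:.3f}")
```

On this toy data the top-ranked description is `region == "south"`, the only subgroup whose price and demand rise together while the dataset as a whole shows a negative correlation. Exhaustive enumeration is used here only because the example is tiny; the number of candidate descriptions grows quickly with the number of attributes and conditions, so a realistic setting would need a more selective search strategy.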
