Conference: Trends and applications in knowledge discovery and data mining

Mining Correlated Patterns with Multiple Minimum All-Confidence Thresholds



Abstract

Correlated patterns are an important class of regularities that exist in a database. The all-confidence measure has been widely used to discover such patterns in real-world applications. This paper theoretically analyzes the all-confidence measure and shows that, although the measure satisfies the null-invariant property, mining correlated patterns involving both frequent and rare items with a single minimum all-confidence (minAllConf) threshold value causes the "rare item problem" if the items' frequencies in a database vary widely. The problem involves either finding only very short correlated patterns involving rare items at a high minAllConf threshold, or generating a huge number of patterns at a low minAllConf threshold. The cause of the problem is that a single minAllConf threshold is not sufficient to capture the items' frequencies in a database effectively. The paper also introduces an alternative model of correlated patterns using the concept of multiple minAllConf thresholds. The proposed model lets the user specify a different minAllConf threshold for each pattern to reflect the varied frequencies of the items within it. Experimental results show that the proposed model is very effective.
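To make the idea concrete, the following is a minimal sketch rather than the paper's algorithm. It uses the standard definition of all-confidence (the support of a pattern divided by the largest support of any single item in it). The abstract does not say how a pattern's minAllConf threshold is derived, so the per-item threshold table `item_thresholds`, the `default` value, and the `combine` rule (here `max`) are hypothetical choices made only for illustration, as are the toy transactions.

```python
def support(pattern, transactions):
    """Count the transactions that contain every item of the pattern."""
    items = set(pattern)
    return sum(1 for t in transactions if items <= t)


def all_confidence(pattern, transactions):
    """Support of the pattern divided by the largest single-item support in it."""
    max_item_sup = max(support([item], transactions) for item in pattern)
    if max_item_sup == 0:
        return 0.0
    return support(pattern, transactions) / max_item_sup


def is_correlated(pattern, transactions, item_thresholds, default=0.5, combine=max):
    """Check a non-empty pattern against its own minAllConf threshold.

    ASSUMPTION: the pattern threshold is built by combining hypothetical
    per-item thresholds; the abstract does not specify this rule.
    """
    threshold = combine(item_thresholds.get(item, default) for item in pattern)
    return all_confidence(pattern, transactions) >= threshold


# Toy database (illustrative only): bread and milk are frequent,
# caviar and wine are rare.
transactions = [
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread", "milk"},
    {"bread"},
    {"bread", "caviar", "wine"},
    {"milk", "caviar", "wine"},
]

# Hypothetical per-item thresholds.
item_thresholds = {"bread": 0.5, "milk": 0.5, "caviar": 0.9, "wine": 0.9}

frequent_pair = is_correlated(["bread", "milk"], transactions, item_thresholds)
rare_pair = is_correlated(["caviar", "wine"], transactions, item_thresholds)
mixed_pair = is_correlated(["bread", "caviar"], transactions, item_thresholds)

# The same check with one global threshold of 0.9 for every pattern.
single_threshold = is_correlated(["bread", "milk"], transactions, {}, default=0.9)
```

In this toy data, a single threshold of 0.9 rejects the pair of frequent items, while pattern-specific thresholds let each pattern be judged against a bar suited to its items. Other combination rules could be passed through `combine`; which rule the paper uses is not stated in the abstract.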
