Journal: Intelligent data analysis

On pushing weight constraints deeply into frequent itemset mining

Abstract

There have been many studies on mining frequent itemsets (or patterns) in the data mining field because of its broad applications in mining association rules, correlations, graph patterns, constraint-based frequent patterns, sequential patterns, and many other data mining tasks. One of the major challenges in frequent pattern mining is the huge number of result patterns: as the minimum support threshold is lowered, an exponentially large number of itemsets is generated. Pruning unimportant patterns effectively during the mining process is therefore one of the main topics in frequent pattern mining. Weighted frequent pattern mining uses not only support but also weight, so that important patterns can be detected. In this paper, we propose two efficient algorithms for mining weighted frequent itemsets, whose main approaches are to push weight constraints into the Apriori algorithm and the pattern growth algorithm, respectively. Additionally, we show how to maintain the downward closure property when mining weighted frequent itemsets. In our approach, normalized weights within a weight range are assigned according to the importance of items. The weight range restricts the weights of items, and a minimum weight is used to balance the weight and support of items when pruning the search space. Our approach generates fewer but more important weighted frequent itemsets in large databases, particularly dense databases with low minimum supports. An extensive performance study shows that our algorithms outperform previous mining algorithms. In addition, they are efficient and scalable.
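
The abstract does not spell out the pruning conditions, so the following Python sketch is only an illustration of the general idea of pushing a weight constraint into an Apriori-style search; it is not the authors' algorithm. It assumes (these are assumptions, not taken from the paper) that the weighted support of an itemset is its support multiplied by the average weight of its items, that weights are normalized into (0, 1], and that candidates are pruned with the upper bound support × maximum item weight. Because support can only shrink as an itemset grows and the maximum weight is a constant, that bound keeps the downward closure property even though weighted support itself does not. The names `weighted_apriori`, `min_wsup`, and the toy data are hypothetical.

```python
from itertools import combinations


def weighted_apriori(transactions, weights, min_wsup):
    """Return {itemset: weighted support} for itemsets meeting min_wsup.

    Assumption (not from the paper): weighted support is
    support(X) * average item weight of X.
    """
    n = len(transactions)
    tsets = [frozenset(t) for t in transactions]
    max_w = max(weights.values())  # constant used in the pruning bound

    def support(itemset):
        # Fraction of transactions that contain every item of the itemset.
        return sum(1 for t in tsets if itemset <= t) / n

    results = {}
    level = [frozenset([item]) for item in sorted(weights)]
    k = 1
    while level:
        survivors = []
        for cand in level:
            sup = support(cand)
            # Anti-monotone upper bound: no superset can beat sup * max_w,
            # so failing it lets us discard the whole branch safely.
            if sup * max_w < min_wsup:
                continue
            survivors.append(cand)
            avg_w = sum(weights[i] for i in cand) / len(cand)
            wsup = sup * avg_w
            if wsup >= min_wsup:
                results[cand] = wsup
        # Apriori join: build (k+1)-candidates whose k-subsets all survived.
        k += 1
        surv_set = set(survivors)
        next_level = set()
        for a, b in combinations(survivors, 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in surv_set
                for sub in combinations(union, k - 1)
            ):
                next_level.add(union)
        level = sorted(next_level, key=sorted)
    return results


# Toy data: item "c" is common but has a low weight.
transactions = [
    ["a", "b", "c"],
    ["a", "b"],
    ["a", "c"],
    ["b", "c"],
    ["a", "b", "c"],
]
weights = {"a": 0.9, "b": 0.6, "c": 0.3}
found = weighted_apriori(transactions, weights, min_wsup=0.3)
for itemset in sorted(found, key=lambda s: (len(s), sorted(s))):
    print(sorted(itemset), round(found[itemset], 2))
```

On this toy data, {a, c} meets the threshold while its subset {c} does not, which is why the sketch prunes on the upper bound rather than on weighted support directly: pruning {c} by its own weighted support would wrongly discard {a, c}.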