首页> 外文会议> >The Complexity of Mining Maximal Frequent Itemsets and Maximal Frequent Patterns
【24h】

The Complexity of Mining Maximal Frequent Itemsets and Maximal Frequent Patterns

机译:最大频繁项集和最大频繁模式的挖掘复杂性

获取原文
获取原文并翻译 | 示例

摘要

Mining maximal frequent itemsets is one of the most fundamental problems in data mining. In this paper we study the complexity-theoretic aspects of maximal frequent item-set mining, from the perspective of counting the number of solutions. We present the first formal proof that the problem of counting the number of distinct maximal frequent item-sets in a database of transactions, given an arbitrary support threshold, is #P-complete, thereby providing strong theoretical evidence that the problem of mining maximal frequent itemsets is NP-hard. This result is of particular interest since the associated decision problem of checking the existence of a maximal frequent itemset is in P. We also extend our complexity analysis to other similar data mining problems dealing with complex data structures, such as sequences, trees, and graphs, which have attracted intensive research interests in recent years. Normally, in these problems a partial order among frequent patterns can be denned in such a way as to preserve the downward closure property, with maximal frequent patterns being those without any successor with respect to this partial order. We investigate several variants of these mining problems in which the patterns of interest are subsequences, subtrees, or subgraphs, and show that the associated problems of counting the number of maximal frequent patterns are all either #P-complete or #P-hard.
机译:挖掘最大频繁项集是数据挖掘中最基本的问题之一。在本文中,我们从计算解决方案数量的角度研究了最大频繁项集挖掘的复杂性理论方面。我们提供了第一个形式证明,即在给定任意支持阈值的情况下,对交易数据库中不同的最大频繁项目集的数量进行计数的问题是#P完全的,从而提供了有力的理论证据,证明了挖掘最大频繁项目集的问题项目集是NP难的。由于检查最大频繁项集的相关决策问题在P中,因此此结果特别有趣。我们还将复杂性分析扩展到处理复杂数据结构(例如序列,树和图)的其他类似数据挖掘问题近年来引起了广泛的研究兴趣。通常,在这些问题中,可以以保持向下闭合特性的方式来确定频繁模式中的部分顺序,其中最大频繁模式是相对于该部分顺序而言没有任何后继的那些。我们研究了这些挖掘问题的几种变体,其中感兴趣的模式是子序列,子树或子图,并表明计算最大频繁模式数的相关问题都是#P-complete或#P-hard。

著录项

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号