Conference: IEEE International Conference on Data Mining
Fast Algorithms for Frequent Itemset Mining from Uncertain Data

Abstract

The majority of existing data mining algorithms mine frequent itemsets from precise databases. A well-known algorithm is FP-growth, which builds a compact FP-tree structure to capture the important contents of the database and mines frequent itemsets from the FP-tree. However, there are situations in which data are uncertain. In recent years, researchers have paid attention to frequent itemset mining from uncertain databases. UFP-growth is one of the frequently cited algorithms for mining uncertain data. However, the corresponding UFP-tree structure can be large. Other tree structures for handling uncertain data may achieve compactness at the expense of looser upper bounds on expected supports. To solve this problem, we propose two compact tree structures that capture uncertain data with tighter upper bounds than existing tree structures. We also design two algorithms that mine frequent itemsets from the proposed trees. Our experimental results show the tightness of the upper bounds on expected supports provided by these algorithms.
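To make the terms "expected support" and "upper bound" concrete, here is a minimal Python sketch. It is not the paper's tree structures or algorithms. It assumes the common model in which each item in a transaction carries an independent existential probability, and it uses one simple per-transaction cap (the product of the two highest probabilities) as an illustrative bound. The database `DB` and the function names are hypothetical.

```python
from math import prod

# Hypothetical uncertain database: each transaction maps an item to its
# existential probability (values are made up for illustration).
DB = [
    {"a": 0.9, "b": 0.8, "c": 0.5},
    {"a": 0.6, "c": 0.5},
    {"b": 0.5, "c": 0.25},
]


def expected_support(itemset, db):
    """Sum, over transactions, of the product of the items' probabilities.

    Assumes items within a transaction are independent. A transaction that
    lacks any item of the itemset contributes nothing.
    """
    return sum(
        prod(t[x] for x in itemset)
        for t in db
        if all(x in t for x in itemset)
    )


def transaction_cap(t):
    """Product of the two highest probabilities in a transaction.

    Because every probability is at most 1, this is an upper bound on the
    contribution of any itemset with at least two items in this transaction.
    """
    top = sorted(t.values(), reverse=True)[:2]
    return prod(top) if len(top) == 2 else 0.0


def capped_support(itemset, db):
    """Illustrative upper bound on expected support (for 2+ items)."""
    return sum(transaction_cap(t) for t in db if all(x in t for x in itemset))


if __name__ == "__main__":
    x = {"a", "c"}
    print(expected_support(x, DB), capped_support(x, DB))
```

For the itemset {a, c}, the expected support is 0.9 × 0.5 + 0.6 × 0.5 = 0.75, while the capped value is 0.9 × 0.8 + 0.6 × 0.5 = 1.02. The gap between the two is the looseness that a tighter bound would reduce: a looser bound lets more infrequent candidates survive pruning and forces extra verification work.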