Journal: Procedia Computer Science

Discovering Top-k Probabilistic Frequent Itemsets from Uncertain Databases


Abstract

Probabilistic frequent itemset mining finds the itemsets in an uncertain database whose support exceeds a threshold with at least a given probabilistic confidence. When the threshold is small, however, the mining results become massive and are hard for users to interpret. In this paper, we address this problem and propose a method for finding the top-k probabilistic frequent itemsets, which, to the best of our knowledge, has not been addressed before. A scoring function is defined to evaluate itemsets. We introduce a compact data structure, named TopKPFITree, to maintain the mining results and some auxiliary information. Furthermore, we propose an efficient algorithm, TopKPFIM, to build the TopKPFITree and obtain the results. Our experimental results over uncertain datasets show that our algorithm significantly outperforms the Naive algorithm.
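
To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not the paper's TopKPFIM algorithm, its TopKPFITree structure, or its Naive baseline, none of which the abstract specifies. It assumes an attribute-level uncertainty model in which each transaction maps items to independent existence probabilities, and it ranks itemsets by their frequentness probability P(support >= minsup) as a stand-in for the paper's scoring function. The names `itemset_prob`, `frequentness_probability`, `naive_top_k`, and the `max_size` cap are hypothetical.

```python
from itertools import combinations
from math import prod


def itemset_prob(transaction, itemset):
    """Probability that every item of `itemset` occurs in one uncertain
    transaction (assumption: items exist independently)."""
    return prod(transaction.get(item, 0.0) for item in itemset)


def frequentness_probability(db, itemset, minsup):
    """P(support(itemset) >= minsup), computed with the standard dynamic
    program for a sum of independent Bernoulli variables."""
    # dist[j] = probability that exactly j transactions seen so far
    # contain the itemset.
    dist = [1.0]
    for transaction in db:
        p = itemset_prob(transaction, itemset)
        nxt = [0.0] * (len(dist) + 1)
        for j, q in enumerate(dist):
            nxt[j] += q * (1.0 - p)   # itemset absent from this transaction
            nxt[j + 1] += q * p       # itemset present in this transaction
        dist = nxt
    return sum(dist[minsup:])


def naive_top_k(db, minsup, k, max_size=3):
    """Enumerate all itemsets up to `max_size` items and return the k with
    the highest frequentness probability (assumed score, not the paper's)."""
    items = sorted({item for transaction in db for item in transaction})
    scored = []
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            score = frequentness_probability(db, itemset, minsup)
            scored.append((score, itemset))
    # Highest score first; ties broken by itemset order for determinism.
    scored.sort(key=lambda entry: (-entry[0], entry[1]))
    return scored[:k]


# Toy uncertain database: item -> existence probability per transaction.
db = [{"a": 0.5, "b": 1.0}, {"a": 0.5}]
top2 = naive_top_k(db, minsup=1, k=2)
```

In the toy database, `{b}` appears in the first transaction with certainty, so it ranks first; `{a}` appears in at least one transaction with probability 1 - 0.5 * 0.5 = 0.75. The enumeration is exponential in the number of items, which is the cost a dedicated structure such as the one the abstract describes would aim to avoid.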
