Discovering Top-k Probabilistic Frequent Itemsets from Uncertain Databases

Haifeng Li; Yuejin Zhang; Ning Zhang


Abstract

Probabilistic frequent itemset mining finds the itemsets in an uncertain database whose support exceeds a threshold with at least a given probabilistic confidence. When the threshold is small, however, the result set becomes massive and hard for users to interpret. In this paper, we address this problem and propose a method for mining the top-k probabilistic frequent itemsets, which, to the best of our knowledge, has not been addressed before. A scoring function is defined to evaluate and rank itemsets. We introduce a compact data structure, named TopKPFITree, to maintain the mining results and auxiliary information. Furthermore, we propose an efficient algorithm, TopKPFIM, that builds the TopKPFITree and returns the results. Our experimental results over uncertain datasets show that our algorithm significantly outperforms the Naive algorithm.
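To make the problem statement concrete, the following is a minimal sketch of a brute-force baseline, not the TopKPFIM algorithm or the TopKPFITree structure. It rests on several assumptions that the abstract does not state: each transaction maps items to independent existence probabilities, transactions are independent, and itemsets are ranked simply by the probability that their support reaches `min_sup`, which stands in for the paper's scoring function. The names `frequent_probability` and `naive_top_k` are hypothetical.

```python
from itertools import combinations
from math import prod
import heapq


def frequent_probability(probs, min_sup):
    """Return P(support >= min_sup), where probs[i] is the probability that
    transaction i contains the itemset (transactions assumed independent)."""
    if min_sup <= 0:
        return 1.0
    # dp[j] for j < min_sup: probability that exactly j transactions so far
    # contain the itemset. dp[min_sup] absorbs every count >= min_sup.
    dp = [0.0] * (min_sup + 1)
    dp[0] = 1.0
    for p in probs:
        # Walk j downwards so dp[j - 1] still holds the previous step's value.
        for j in range(min_sup, -1, -1):
            if j == min_sup:
                dp[j] = dp[j] + dp[j - 1] * p
            elif j > 0:
                dp[j] = dp[j] * (1.0 - p) + dp[j - 1] * p
            else:
                dp[0] = dp[0] * (1.0 - p)
    return dp[min_sup]


def naive_top_k(db, min_sup, k, max_size=None):
    """Enumerate every itemset and keep the k with the highest probability of
    being frequent. Exponential in the number of items; illustration only."""
    items = sorted({item for transaction in db for item in transaction})
    if max_size is None:
        max_size = len(items)
    scored = []
    for size in range(1, max_size + 1):
        for itemset in combinations(items, size):
            # Assumed independence: an itemset's probability in a transaction
            # is the product of its items' probabilities (0 if one is absent).
            probs = [prod(t.get(i, 0.0) for i in itemset) for t in db]
            scored.append((itemset, frequent_probability(probs, min_sup)))
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])
```

For example, with `db = [{"a": 0.5, "b": 1.0}, {"a": 0.5}, {"a": 1.0, "b": 0.5}]`, `min_sup = 2` and `k = 2`, this sketch ranks `("a",)` first with probability 0.75 and `("b",)` second with probability 0.5. The exhaustive enumeration is what makes such a baseline expensive, which is the cost a dedicated top-k algorithm aims to avoid.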


Bibliographic details

Source: Procedia Computer Science, 2017, Issue 1, 9 pages
Authors: Haifeng Li; Yuejin Zhang; Ning Zhang
Original format: PDF
Chinese Library Classification: Computing technology, computer technology

