A Distributed Frequent Itemset Mining Algorithm for Uncertain Data

Jiaman Ding; Haibin Li; Yang Yang; Lianyin Jia; Jinguo You

Abstract

With the rapid expansion of big data in all domains, improving the performance of frequent pattern mining on massive uncertain datasets has become a major research topic in recent years. Most conventional frequent pattern mining approaches use expectation, probability, or weight as the single factor of item support, and algorithms that consider both probability and weight struggle to remain efficient at big-data scale. Therefore, we propose Dfimud, a distributed frequent itemset mining algorithm for uncertain data. First, Dfimud calculates the maximum probability weight value of each 1-item and prunes the items whose value is less than the given threshold. Second, to reduce the number of dataset scans, a distributed Dfimud-tree structure inspired by the FP-Tree is designed to mine frequent patterns. Finally, experiments on publicly available UCI datasets show that Dfimud achieves better results than related approaches across various metrics. The empirical study also shows that Dfimud scales well.
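The first step described in the abstract, scoring each 1-item and pruning those below a threshold, can be sketched roughly as follows. The abstract does not define the "maximum probability weight value", so this sketch assumes a stand-in: the largest item weight multiplied by the item's expected support (the sum of its existential probabilities over all transactions). The function name `prune_one_items`, the data layout, and the sample values are all hypothetical, and the code runs on a single machine rather than a distributed framework.

```python
from collections import defaultdict


def prune_one_items(transactions, weights, threshold):
    """Keep 1-items whose assumed upper-bound score reaches the threshold.

    ASSUMPTION: score(item) = max item weight * expected support of item.
    This is an illustrative stand-in, not the paper's definition.
    """
    max_weight = max(weights.values())

    # Expected support: sum of an item's probabilities over all transactions.
    expected_support = defaultdict(float)
    for transaction in transactions:
        for item, probability in transaction.items():
            expected_support[item] += probability

    # Single pass over the accumulated scores; drop items below the threshold.
    scores = {item: max_weight * esup for item, esup in expected_support.items()}
    return {item: score for item, score in scores.items() if score >= threshold}


# Hypothetical uncertain dataset: each transaction maps item -> probability.
transactions = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.8, "c": 0.2},
    {"b": 0.5, "c": 0.1},
]
weights = {"a": 0.5, "b": 1.0, "c": 0.25}  # hypothetical item weights

kept = prune_one_items(transactions, weights, threshold=0.8)
print(sorted(kept))
```

With these sample values, item `c` has an expected support of about 0.3 and falls below the threshold, so only `a` and `b` would go on to the tree-building stage.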

Bibliographic details

Source
International Journal of Performability Engineering, 2019, No. 10, 12 pages
Authors
Jiaman Ding; Haibin Li; Yang Yang; Lianyin Jia; Jinguo You
Affiliation

Faculty of Information Engineering and Automation, Kunming University of Science and Technology (listed for each of the five authors)

Original format: PDF
Language: English
Chinese Library Classification: Engineering design and surveying
Keywords
Data mining; Uncertain data; Frequent itemset; Distributed framework
