Weighted frequent itemset mining over uncertain databases

Lin Jerry Chun-Wei; Gan Wensheng; Fournier-Viger Philippe; Hong Tzung-Pei; Tseng Vincent S.

Abstract

Frequent itemset mining (FIM) is a fundamental research topic that consists of discovering useful and meaningful relationships between items in transaction databases. However, FIM suffers from two important limitations. First, it assumes that all items have the same importance. Second, it ignores the fact that data collected in a real-life environment is often inaccurate, imprecise, or incomplete. To address these issues and mine more useful and meaningful knowledge, the problems of weighted itemset mining and uncertain itemset mining have each been proposed: in the former, a user assigns weights to items to specify their relative importance; in the latter, existential probabilities represent the uncertainty of items in transactions. However, no work has addressed both of these issues at the same time. In this paper, we address this research problem by defining a new type of pattern, the high expected weighted itemset (HEWI), and by proposing the HEWI-Uapriori algorithm to discover HEWIs efficiently. HEWI-Uapriori finds HEWIs using an Apriori-like two-phase approach. The algorithm introduces a property named high upper-bound expected weighted downward closure (HUBEWDC) to prune the search space and discard unpromising itemsets early. Substantial experiments on real-life and synthetic datasets are conducted to evaluate the performance of the proposed algorithm in terms of runtime, memory consumption, and number of patterns found. Results show that the proposed algorithm has excellent performance and scalability compared with traditional methods for weighted itemset mining and uncertain itemset mining.
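The abstract does not give formulas, so the following Python sketch is only an illustration under stated assumptions, not the authors' HEWI-Uapriori implementation. It assumes that the weight of an itemset is the mean of its item weights, that its expected support is the sum over transactions of the product of its items' existential probabilities, and that an itemset is a HEWI when weight times expected support reaches `min_ratio * |D|`. The upper bound used for pruning (maximum weight times maximum probability per transaction) is likewise an assumed stand-in for the HUBEWDC property. All function and variable names are hypothetical.

```python
from itertools import combinations
from math import prod


def expected_weighted_support(itemset, db, weights):
    """Assumed definition: mean item weight times expected support."""
    weight = sum(weights[i] for i in itemset) / len(itemset)
    exp_sup = sum(prod(t[i] for i in itemset)
                  for t in db if all(i in t for i in itemset))
    return weight * exp_sup


def upper_bound(itemset, db, weights):
    """Assumed bound: per transaction, max item weight times max probability.

    It never underestimates the expected weighted support and cannot grow
    when the itemset is extended, so it is safe for level-wise pruning.
    """
    return sum(max(weights[i] for i in t) * max(t.values())
               for t in db if all(i in t for i in itemset))


def mine_hewis(db, weights, min_ratio):
    """Two-phase, Apriori-like search for high expected weighted itemsets."""
    threshold = min_ratio * len(db)
    items = sorted({i for t in db for i in t})

    # Phase 1: level-wise candidate generation, pruned by the upper bound.
    level = [frozenset([i]) for i in items
             if upper_bound([i], db, weights) >= threshold]
    candidates = []
    while level:
        candidates.extend(level)
        kept = set(level)
        next_level = set()
        for a, b in combinations(level, 2):
            c = a | b
            # Join step: c must be one item larger, with all subsets kept.
            if len(c) == len(a) + 1 and all(c - {i} in kept for i in c):
                if upper_bound(c, db, weights) >= threshold:
                    next_level.add(c)
        level = list(next_level)

    # Phase 2: keep candidates whose exact value reaches the threshold.
    result = {}
    for c in candidates:
        value = expected_weighted_support(c, db, weights)
        if value >= threshold:
            result[c] = value
    return result


# Toy uncertain database: each transaction maps item -> existential probability.
db = [
    {"A": 0.9, "B": 0.8},
    {"A": 0.5, "C": 0.6},
    {"A": 1.0, "B": 0.4, "C": 0.7},
]
weights = {"A": 0.2, "B": 0.8, "C": 0.5}

for itemset, value in mine_hewis(db, weights, min_ratio=0.2).items():
    print(sorted(itemset), round(value, 3))
```

Under these assumed definitions, `{A, B}` scores higher than `{A}` in the toy data, so the measure itself is not downward closed. This is why the pruning step relies on an upper bound that is downward closed rather than on the measure itself.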

Bibliographic details

Source: Applied Intelligence: The International Journal of Artificial Intelligence, Neural Networks, and Complex Problem-Solving Technologies, 2016, Issue 1, 19 pages
Authors: Lin Jerry Chun-Wei; Gan Wensheng; Fournier-Viger Philippe; Hong Tzung-Pei; Tseng Vincent S.
Original format: PDF
Language: English
Chinese Library Classification: Automation technology; computer technology
Keywords: Data mining; Uncertain databases; Weighted frequent itemsets; Two-phase; Upper-bound
