Conference: International Conference on Machine Learning and Data Mining

Efficient Mining of Weighted Frequent Itemsets in Uncertain Databases


Abstract

Frequent itemset mining (FIM) is a fundamental set of techniques used to discover useful and meaningful relationships between items in transaction databases. Extensions of FIM such as weighted frequent itemset mining (WFIM) and frequent itemset mining in uncertain databases (UFIM) have since been proposed. WFIM accounts for the fact that items may differ in weight (importance), and UFIM accounts for the fact that data collected in a real-life environment is often inaccurate, imprecise, or incomplete. Recently, a two-phase Apriori-based approach called HEWI-Uapriori was proposed that considers both item weight and uncertainty to mine high expected weighted itemsets (HEWIs), but it generates a large number of candidates and is too time-consuming. In this paper, a more efficient algorithm named HEWI-Utree is developed to mine HEWIs without performing multiple database scans and without generating a huge number of candidates. It relies on three novel structures, named the element (E)-table, the weighted-probability (WP)-table, and the WP-tree, to maintain the information required for identifying and pruning unpromising itemsets early. Experimental results show that the proposed algorithm is more efficient than traditional WFIM and UFIM methods, as well as the HEWI-Uapriori algorithm, in terms of runtime, memory usage, and scalability.
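
To make the notion of a high expected weighted itemset concrete, here is a minimal brute-force sketch in Python. The abstract does not give the formal definitions, so the sketch assumes a common formulation: the weight of an itemset is the average of its item weights, its probability in a transaction is the product of its item probabilities, and an itemset is a HEWI when weight times summed probability reaches a minimum ratio times the database size. The toy data and all function names are hypothetical, and the exhaustive enumeration is not the HEWI-Utree algorithm or its E-table, WP-table, and WP-tree structures.

```python
from itertools import combinations

# Hypothetical toy uncertain database: each transaction maps
# an item to its existential probability in that transaction.
transactions = [
    {"a": 0.9, "b": 0.5},
    {"a": 0.6, "c": 0.8},
    {"a": 1.0, "b": 0.4, "c": 0.5},
]

# Hypothetical item weights (importance), assumed to lie in (0, 1].
weights = {"a": 0.2, "b": 0.8, "c": 0.5}


def itemset_weight(itemset, weights):
    """Average weight of the items in the itemset (assumed definition)."""
    return sum(weights[item] for item in itemset) / len(itemset)


def expected_support(itemset, transactions):
    """Sum over transactions of the product of the item probabilities."""
    total = 0.0
    for transaction in transactions:
        if all(item in transaction for item in itemset):
            probability = 1.0
            for item in itemset:
                probability *= transaction[item]
            total += probability
    return total


def expected_weighted_support(itemset, transactions, weights):
    """Itemset weight multiplied by expected support (assumed definition)."""
    return itemset_weight(itemset, weights) * expected_support(itemset, transactions)


def mine_hewis_bruteforce(transactions, weights, min_ratio):
    """Enumerate every itemset and keep those meeting the threshold.

    Exhaustive and exponential in the number of items; for illustration only.
    """
    threshold = min_ratio * len(transactions)
    items = sorted({item for transaction in transactions for item in transaction})
    result = {}
    for size in range(1, len(items) + 1):
        for itemset in combinations(items, size):
            score = expected_weighted_support(itemset, transactions, weights)
            if score >= threshold:
                result[itemset] = score
    return result
```

On this toy data, `mine_hewis_bruteforce(transactions, weights, 0.2)` keeps only the single-item sets `("b",)` and `("c",)`. The enumeration visits every subset of items, which is the kind of cost that early pruning of unpromising itemsets is meant to avoid.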
