IEEE Transactions on Knowledge and Data Engineering

Progressive partition miner: an efficient algorithm for mining general temporal association rules


Abstract

We explore a new problem: mining general temporal association rules in publication databases. In essence, a publication database is a set of transactions in which each transaction T is a set of items, and each item has its own exhibition period. The current model of association rule mining cannot handle publication databases because of two fundamental problems: 1) it does not consider the exhibition period of each individual item, and 2) it lacks an equitable support counting basis for each item. To remedy this, we propose an innovative algorithm, progressive-partition-miner (abbreviated as PPM), to discover general temporal association rules in a publication database. The basic idea of PPM is to first partition the publication database according to the exhibition periods of items and then progressively accumulate the occurrence count of each candidate 2-itemset based on the intrinsic partitioning characteristics. PPM also employs a filtering threshold in each partition to prune cumulatively infrequent 2-itemsets early. Because the number of candidate 2-itemsets generated by PPM is very close to the number of frequent 2-itemsets, we can employ the scan reduction technique to effectively reduce the number of database scans. The execution time of PPM is orders of magnitude smaller than that required by other competitive schemes that are directly extended from existing methods. The correctness of PPM is proven and some of its theoretical properties are derived. A sensitivity analysis of various parameters provides further insight into PPM.
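
The partition-then-accumulate idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's exact procedure: the abstract does not give the threshold formula, so the sketch assumes a candidate 2-itemset is kept only while its accumulated count reaches `min_support` times the number of transactions seen since the partition where it first appeared. The function name `progressive_candidate_2_itemsets` and the parameters `partitions` and `min_support` are hypothetical.

```python
from itertools import combinations
from math import ceil


def progressive_candidate_2_itemsets(partitions, min_support):
    """Accumulate 2-itemset counts partition by partition, pruning early.

    partitions:  list of partitions, each a list of transactions (sets of items),
                 assumed to be ordered by time period.
    min_support: relative support threshold in (0, 1].
    Returns {pair: (start_partition_index, accumulated_count)}.
    """
    candidates = {}  # pair -> [start partition index, accumulated count]
    sizes = []       # number of transactions in each partition seen so far

    for p, transactions in enumerate(partitions):
        sizes.append(len(transactions))

        # Count every 2-itemset occurring in this partition.
        for transaction in transactions:
            for pair in combinations(sorted(transaction), 2):
                if pair in candidates:
                    candidates[pair][1] += 1
                else:
                    candidates[pair] = [p, 1]  # first seen in partition p

        # Assumed filtering rule: compare the accumulated count against the
        # transactions seen since the pair's start partition, not the whole
        # database, so late-appearing items are counted on an equal basis.
        for pair in list(candidates):
            start, count = candidates[pair]
            threshold = ceil(min_support * sum(sizes[start:p + 1]))
            if count < threshold:
                del candidates[pair]

    return {pair: (start, count) for pair, (start, count) in candidates.items()}


# Toy data: items C and D only co-occur in the second partition.
example = [
    [{"A", "B"}, {"A", "B", "C"}, {"A", "C"}, {"B"}],
    [{"C", "D"}, {"C", "D"}, {"A", "B"}, {"D"}],
]
print(progressive_candidate_2_itemsets(example, 0.5))
# → {('C', 'D'): (1, 2)}
```

In this toy run the pair (C, D) survives because it is measured only against the partition in which it first appears, while (A, B) and (A, C) are pruned once their accumulated counts fall below the threshold over both partitions.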
