首页> 中文期刊>燕山大学学报 >一种基于频繁模式有向无环图的数据流频繁模式挖掘算法

一种基于频繁模式有向无环图的数据流频繁模式挖掘算法

     

摘要

The algorithms based on FP-growth for mining frequent pattems need to scan the transaction database twice and give the threshold in advance, it also can not support time sensitive data stream.In this paper an algorithm based on frequent pattem directed acyclic graph far mining frequent pattems from data stream is proposed.Each transaction is given a number according to its coming time.Items contained in a transaction are sorted by their order, and the directed acyclic graph follows that order.A frequent pattem directed acyclic graph records the relation between transactions and items by its number.The process of pattem growth is to increase the transaction number of the directed edges.The atgonthm travels edges conversely by the same transaction number and products the conditional pattem bases, it can abstract the information of the conditional pattern bases according to dynamic threshold and scan the database only once to gain frequent patterns.The experiment shows that the proposed algorithm is better than FP-growth in execute time and the number of the storaged node is significiantly reduced.%频繁模式挖掘中基于FP-growth的算法需要扫描两次事务数据库,预先给定支持度,且不支持时间敏感型数据.本文提出了一种基于频繁模式有向无环图的数据流频繁模式挖掘算法,它根据事务到来的时间给每个事务一个序号,每个事务中的数据项在存储前按数据项的顺序进行调整,频繁模式有向无环图的构建遵循这个顺序并用序号来记录事务与数据项的包含关系,模式增长过程只需要增加有向边上的序号.通过逆向遍历带有相同序号的有向边,产生条件模式基,根据动态定义的阈值抽取条件模式基信息,一次扫描数据库得到频繁模式.实验结果表明,本文算法的执行效率优于FP-growth算法,且存储节点的数目明显减少.

著录项

相似文献

  • 中文文献
  • 外文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号