Near closed frequent itemsets to accelerate the generation of association rules in a data stream environment


Abstract

This research concerns data stream mining, one of the most challenging and widely researched areas in Knowledge Discovery and Data Mining (KDD). A data stream is a continuous, voluminous, and unpredictable flow of data that occurs in many application domains. In a previous study, the Data Stream Mining (DSM) algorithm was proposed to overcome these problems in association rule mining. It was built using techniques such as closed frequent itemsets, tree data structures, itemset pruning, and statistical sampling. In this study, we examine the characteristics of closed frequent itemsets and propose a novel concept called Near Closed Nodes (NCN). We developed NCN algorithms that can be applied to association rule mining algorithms that use a closed itemset structure, and we explored and developed the concept in conjunction with the existing DSM algorithm. Incorporating NCN into the DSM algorithm improved both speed and memory usage. A comprehensive experimental study compared the performance of DSM and DSM-NCN on both simulated and real-world datasets. Based on the results, we conclude that DSM-NCN outperforms DSM in most circumstances, especially when the datasets are dense.
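For readers unfamiliar with the closed frequent itemsets that DSM builds on, the following is a minimal brute-force sketch in Python. It is not the DSM or NCN algorithm, neither of which is specified in this abstract. It only illustrates the standard definition: a frequent itemset is closed if no proper superset has the same support. The function names, the toy transactions, and the `min_support` value are all assumptions made for illustration.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset meeting min_support.

    Brute-force enumeration for illustration only; not suitable for streams.
    """
    items = sorted({item for t in transactions for item in t})
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for candidate in combinations(items, size):
            cset = frozenset(candidate)
            count = sum(1 for t in transactions if cset <= t)
            if count >= min_support:
                result[cset] = count
                found = True
        if not found:
            # Support is anti-monotone: if no itemset of this size is
            # frequent, no larger itemset can be frequent either.
            break
    return result


def closed_itemsets(frequent):
    """Keep only itemsets that have no proper superset with equal support."""
    return {
        s: c
        for s, c in frequent.items()
        if not any(s < other and c == oc for other, oc in frequent.items())
    }


# Hypothetical toy data: four transactions over items a, b, c, d.
transactions = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "d"},
)]

frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)

print(len(frequent), "frequent itemsets,", len(closed), "closed")
for itemset, support in sorted(closed.items(), key=lambda kv: len(kv[0])):
    print(sorted(itemset), support)
```

On this toy data, seven frequent itemsets reduce to three closed ones ({a}, {a, b}, and {a, b, c}), and the support of every other frequent itemset can be recovered from its smallest closed superset. That compression is the usual reason for mining closed itemsets rather than all frequent itemsets.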
