
Massive data mining based on item sequence set grid space


Abstract

Based on how massive data is stored in relational databases, this paper proposes a fast algorithm for mining maximum frequent item sets using an item sequence set grid space. Traditional methods for mining association rules generate frequent item sets from small to large. These approaches are time consuming and computationally expensive, and they often generate a large number of redundant candidates or frequent item sets, which severely limits mining speed as the data grows to massive scale. The goal of this paper is first to store item sequences in a self-defined linked-list structure, and then to find the frequent item sets from large to small. Several applications of association rule mining based on item sequence set grid space perform well, but the approach has proved inefficient for massive data mining. The bottleneck is the time spent finding sub-item sets. Experimental results are presented to show that the fast mining algorithm proposed in this paper, ISSDL-DM, takes much less time than the similar existing algorithm ISS-DM to achieve the same outcomes.
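To make the "large to small" search direction concrete, the following is a minimal Python sketch of a top-down search for maximal frequent item sets. It is not ISSDL-DM or ISS-DM: the abstract does not specify the linked-list or grid-space structures, so none are modeled here. The function name `maximal_frequent_itemsets`, the absolute `min_support` count, and the sample transactions are all assumptions made for illustration, and the brute-force subset expansion is exponential in the worst case.

```python
from itertools import combinations


def maximal_frequent_itemsets(transactions, min_support):
    """Brute-force top-down search (illustrative only, not ISSDL-DM).

    min_support is an absolute transaction count and must be >= 1.
    """
    rows = [frozenset(t) for t in transactions]
    if not rows:
        return []

    def support(itemset):
        # Number of transactions that contain every item of the itemset.
        return sum(1 for row in rows if itemset <= row)

    # Candidates are grouped by size; the largest possible frequent sets
    # are the transactions themselves, so the search starts from them.
    candidates = {}
    for row in rows:
        candidates.setdefault(len(row), set()).add(row)

    maximal = []
    for size in range(max(candidates), 0, -1):
        for cand in candidates.get(size, ()):
            # A subset of a known maximal set is frequent but not maximal.
            if any(cand <= found for found in maximal):
                continue
            if support(cand) >= min_support:
                maximal.append(cand)
            elif size > 1:
                # Infrequent: fall back to its subsets one item smaller.
                for sub in combinations(cand, size - 1):
                    candidates.setdefault(size - 1, set()).add(frozenset(sub))
    return maximal


# Hypothetical sample data: five transactions over items a, b, c, d.
sample = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b", "d"}, {"b", "d"}, {"c"}]
result = maximal_frequent_itemsets(sample, min_support=2)
print(sorted(sorted(s) for s in result))  # [['a', 'b', 'c'], ['b', 'd']]
```

Because a large frequent set is accepted as soon as it is found, none of its subsets needs to be counted. That is the saving a top-down order aims for compared with building candidates from small to large.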
