Massive data mining based on item sequence set grid space



Abstract

Based on how massive data is stored in relational databases, this paper proposes a fast algorithm for mining maximum frequent item sets using an item sequence set grid space. Traditional methods for mining association rules generate frequent item sets from small to large. These approaches are either time consuming or computationally expensive, and they often generate a large number of redundant candidates or frequent item sets, which severely limits mining speed once the data reaches massive scale. The approach taken here is first to store item sequences in a linked list with a self-defined structure, and then to find frequent item sets from large to small. Several applications of association rule mining based on item sequence set grid space perform well, but they prove inefficient for massive data mining; the problem lies in the time spent finding sub item sets. Experimental results show that the fast mining algorithm proposed in this paper, ISSDL-DM, takes much less time than the similar existing algorithm, ISS-DM, to achieve the same results.
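
The "large to small" search direction mentioned in the abstract can be illustrated with a minimal Python sketch. This is not ISSDL-DM or ISS-DM: the abstract does not specify the linked-list structure or the grid space, so the sketch uses plain sets and brute-force enumeration, and the names `support` and `maximal_frequent_itemsets` are hypothetical. It only shows why a top-down search can skip every subset of an already confirmed frequent item set.

```python
from itertools import combinations


def support(itemset, transactions):
    """Count the transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def maximal_frequent_itemsets(transactions, min_support):
    """Return the frequent item sets that have no frequent proper superset.

    Illustrative brute-force search, not the paper's algorithm.
    """
    transactions = [frozenset(t) for t in transactions]
    # An item that is infrequent on its own cannot appear in any frequent set.
    items = sorted(
        item
        for item in set().union(*transactions)
        if support(frozenset([item]), transactions) >= min_support
    )
    maximal = []
    # Walk candidate sizes from large to small.
    for size in range(len(items), 0, -1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # Subsets of a known frequent set are frequent but not maximal,
            # so they are skipped without scanning the transactions.
            if any(candidate <= found for found in maximal):
                continue
            if support(candidate, transactions) >= min_support:
                maximal.append(candidate)
    return maximal


baskets = [{"a", "b", "c"}, {"a", "b"}, {"a", "c"}, {"b", "c"}, {"a", "b", "c"}]
result = maximal_frequent_itemsets(baskets, min_support=3)
```

The enumeration is exponential in the number of frequent items, so the sketch is only suitable for small examples; it makes no claim about the performance figures reported in the paper.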


Bibliographic details

Source
2010 2nd International Asia Conference on Informatics in Control, Automation and Robotics (CAR 2010) | 2010 | pp. 208-211 | 4 pages
Conference location: Wuhan (CN)
Authors
Lijuan Zhou; Zhang Zhang; Mingsheng Xu;
Author affiliation

Inf. Eng. Coll., Capital Normal Univ., Beijing, China;

Original format: PDF
Language: English
Chinese Library Classification: Automation technology and equipment
Keywords
data structure; item sequence set grid space; massive data mining; maximum frequent item;


Similar literature


1. An Efficient Closed Frequent Item Sets Mining Algorithm-For Mining Closed Frequent Item Sets from Data Streams [J]. Kuthadi Venu Madhav, Selvaraj Rajalakshmi. Journal of computational and theoretical nanoscience. 2016, No. 10
2. High Performance Computation of Big Data: Performance Optimization Approach towards a Parallel Frequent Item Set Mining Algorithm for Transaction Data based on Hadoop MapReduce Framework [J]. Guru Prasad M S, Nagesh H R, Swathi Prabhu. International Journal of Intelligent Systems and Applications. 2017, No. 1
3. Cluster Analysis-Based Approaches for Geospatiotemporal Data Mining of Massive Data Sets for Identification of Forest Threats [J]. Richard Tran Mills, Forrest M. Hoffman, Jitendra Kumar. Procedia Computer Science. 2011, No. 1
4. Massive data mining based on item sequence set grid space [C]. Lijuan Zhou, Zhang Zhang, Mingsheng Xu. 2010 2nd International Asia Conference on Informatics in Control, Automation and Robotics (CAR 2010). 2010
5. Mining emerging massive scientific sequence data using block-wise decomposition methods [D]. Zhang, Qi. 2009
6. Automating Genomic Data Mining via a Sequence-based Matrix Format and Associative Rule Set [O]. Jonathan D Wren, David Johnson, Le Gruenwald. 2005
7. Cluster Analysis-Based Approaches for Geospatiotemporal Data Mining of Massive Data Sets for Identification of Forest Threats [O]. Prof Satoshi Matsuoka, Richard Tran Mills, Forrest M. Hoffman. 2015
8. Cluster Analysis-Based Approaches for Geospatiotemporal Data Mining of Massive Data Sets for Identification of Forest Threats [R]. Mills, R. T., Hoffman, F. M., Kumar, J. 2011
