A Multi-Domain Architecture for Mining Frequent Items and Itemsets from Distributed Data Streams

Eugenio Cesario; Carlo Mastroianni; Domenico Talia


Abstract

Real-time analysis of distributed data streams is challenging because it requires scalable solutions that can handle data generated very rapidly by multiple sources. This paper presents the design and implementation of an architecture for analyzing data streams in distributed environments. Specifically, the analysis computes the items and itemsets whose frequency exceeds a threshold. The mining approach is hybrid: frequent items are calculated in a single pass using a sketch algorithm, while frequent itemsets are calculated by a further multi-pass analysis. The architecture combines parallel and distributed processing to keep pace with the rate of the distributed data streams. To keep computation close to the data, miners are distributed among the domains where the data streams are generated. The paper reports experimental results obtained with a prototype of the architecture, tested on a Grid composed of three domains, each handling one data stream.
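
The hybrid approach described in the abstract can be illustrated with a minimal sketch. The abstract does not name the sketch algorithm, so the Count-Min sketch used here is an assumption, as are all names (`CountMinSketch`, `frequent_items`, `frequent_pairs`), the parameter values, and the idea of merging per-domain sketches cell by cell. The second pass counts only pairs, as a stand-in for the multi-pass itemset analysis; it is not the authors' implementation.

```python
import hashlib
from collections import Counter
from itertools import combinations


class CountMinSketch:
    """Count-Min sketch: approximate counts that never underestimate."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One cell per row, chosen by a row-salted hash of the item.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                str(item).encode("utf-8"), digest_size=8, salt=bytes([row])
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, item, count=1):
        for row, col in self._cells(item):
            self.rows[row][col] += count

    def estimate(self, item):
        return min(self.rows[row][col] for row, col in self._cells(item))

    def merge(self, other):
        # Sketches with the same shape and hashing add cell by cell.
        assert (self.width, self.depth) == (other.width, other.depth)
        for mine, theirs in zip(self.rows, other.rows):
            for i, value in enumerate(theirs):
                mine[i] += value


def sketch_stream(transactions):
    """Single pass over one domain's stream: count each item once per transaction."""
    sketch = CountMinSketch()
    for transaction in transactions:
        for item in set(transaction):
            sketch.add(item)
    return sketch


def frequent_items(sketch, candidates, n_transactions, threshold):
    """Candidates whose estimated support exceeds threshold * n_transactions."""
    limit = threshold * n_transactions
    return {c for c in candidates if sketch.estimate(c) > limit}


def frequent_pairs(transactions, items, threshold):
    """Further pass: exact counts of pairs built only from frequent items."""
    counts = Counter()
    for transaction in transactions:
        kept = sorted(set(transaction) & items)
        counts.update(combinations(kept, 2))
    limit = threshold * len(transactions)
    return {pair for pair, n in counts.items() if n > limit}


# Two hypothetical domains, each sketching its own stream locally.
domain_a = [["a", "b", "c"], ["a", "b"], ["a", "d"]]
domain_b = [["a", "b"], ["b", "e"], ["a", "b", "f"]]

merged = sketch_stream(domain_a)
merged.merge(sketch_stream(domain_b))

all_transactions = domain_a + domain_b
candidates = {item for t in all_transactions for item in t}
frequent = frequent_items(merged, candidates, len(all_transactions), 0.5)
pairs = frequent_pairs(all_transactions, frequent, 0.5)
```

Because a Count-Min estimate can only overcount, the first pass may admit a few false positives but never misses a truly frequent item; the exact counting in the later pass then filters the candidate itemsets.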


Bibliographic details

Source: Journal of Grid Computing, 2014, Issue 1, 16 pages
Authors: Eugenio Cesario; Carlo Mastroianni; Domenico Talia
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords: Distributed data mining; Frequent items; Frequent itemsets; Grid; Stream mining

Similar literature

1. A Multi-Domain Architecture for Mining Frequent Items and Itemsets from Distributed Data Streams [J]. Eugenio Cesario, Carlo Mastroianni, Domenico Talia. Journal of Grid Computing. 2014, Issue 1.
2. Frequent Itemset mining over transactional data streams using Item-Order-Tree [J]. Pramod S., O.P. Vyas. International Journal on Computer Science and Engineering. 2010, Issue 8.
3. Efficient Subset-Lattice Algorithms for Mining Closed Frequent Itemsets and Maximal Frequent Itemsets in Data Streams [J]. Ye-In Chang, Chia-En Li, Wei-Hau Peng. International Journal of Electrical Engineering: Transactions of the Chinese Institute of Engineers, Series E. 2013, Issue 2.
4. A Sketch-Based Architecture for Mining Frequent Items and Itemsets from Distributed Data Streams [C]. Cesario Eugenio, Grillo Antonio, Mastroianni Carlo. 11th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing. 2011.
5. Mining Frequent Itemsets from Uncertain Data: Extensions to Constrained Mining and Stream Mining [D]. Hao, Boyu. 2010.
6. Genetic Programming and Frequent Itemset Mining to Identify Feature Selection Patterns of iEEG and fMRI Epilepsy Data [O]. Otis Smart, Lauren Burrell.
7. Continuous Prediction of Closed Frequent Itemsets from High speed Distributed Data Streams using Parallel Mining on Manifold Windows with Varying Size [O]. V. SiddaReddy, T.V. Rao, A. Govardhan. 2014.
