
Distributed Adaptive Model Rules for mining big data streams



Abstract

Decision rules are among the most expressive data mining models. We propose the first distributed streaming algorithm to learn decision rules for regression tasks. The algorithm is available in SAMOA (Scalable Advanced Massive Online Analysis), an open-source platform for mining big data streams. It uses a hybrid of vertical and horizontal parallelism to distribute Adaptive Model Rules (AMRules) on a cluster. The decision rules built by AMRules are comprehensible models, where the antecedent of a rule is a conjunction of conditions on the attribute values, and the consequent is a linear combination of the attributes. Our evaluation shows that this implementation is scalable in relation to CPU and memory consumption. On a small commodity Samza cluster of 9 nodes, it can handle a rate of more than 30000 instances per second, and achieve a speedup of up to 4.7x over the sequential version.
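The rule structure described in the abstract (an antecedent that is a conjunction of conditions on attribute values, and a consequent that is a linear combination of the attributes) can be illustrated with a minimal sketch. Everything named here (`Condition`, `Rule`, the attributes `temp` and `humidity`, the weights) is hypothetical and chosen for illustration only; this is not SAMOA's API, and it leaves out the online learning, rule expansion, and distribution that the paper is actually about.

```python
import operator
from dataclasses import dataclass
from typing import Callable, Dict, List

Instance = Dict[str, float]  # attribute name -> numeric value


@dataclass
class Condition:
    """One test on a single attribute, e.g. temp > 20 (hypothetical names)."""
    attribute: str
    op: Callable[[float, float], bool]  # e.g. operator.gt or operator.le
    threshold: float

    def holds(self, x: Instance) -> bool:
        return self.op(x[self.attribute], self.threshold)


@dataclass
class Rule:
    """Antecedent: conjunction of conditions. Consequent: linear model."""
    conditions: List[Condition]
    weights: Dict[str, float]
    intercept: float = 0.0

    def covers(self, x: Instance) -> bool:
        # The antecedent is a conjunction, so every condition must hold.
        return all(c.holds(x) for c in self.conditions)

    def predict(self, x: Instance) -> float:
        # The consequent is a linear combination of the attribute values.
        return self.intercept + sum(w * x[a] for a, w in self.weights.items())


# IF temp > 20 AND humidity <= 0.5 THEN y = 1.0 + 2.0*temp - 4.0*humidity
rule = Rule(
    conditions=[
        Condition("temp", operator.gt, 20.0),
        Condition("humidity", operator.le, 0.5),
    ],
    weights={"temp": 2.0, "humidity": -4.0},
    intercept=1.0,
)

x = {"temp": 25.0, "humidity": 0.25}
if rule.covers(x):
    print(rule.predict(x))  # 1.0 + 50.0 - 1.0 = 50.0
```

Both halves of such a rule can be read directly, which is the sense in which the abstract calls these models comprehensible.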


Bibliographic details

Source
IEEE International Congress on Big Data, 2014, pp. 345-353 (9 pages)
Authors
Anh Thu Vu; De Francisci Morales Gianmarco; Gama Joao; Bifet Albert
Original format
PDF
Keywords
Big Data; data mining; public domain software; SAMOA; Samza cluster; big data stream mining; decision rules; distributed AMRules; distributed adaptive model rules; distributed streaming algorithm; expressive data mining models; open-source platform; scalable advanced massive online analysis; Adaptation models; Data mining; Data models; Heat-assisted magnetic recording; Machine learning algorithms; Parallel processing; Predictive models;


