A distributed evolutionary multivariate discretizer for Big Data processing on Apache Spark

Ramirez-Gallego S.; Garcia S.; Benitez J. M.; Herrera F.

Abstract

Nowadays the phenomenon of Big Data is overwhelming our capacity to extract relevant knowledge through classical machine learning techniques. Discretization (as part of data reduction) is presented as a real solution to reduce this complexity. However, standard discretizers are not designed to perform well with such amounts of data. This paper proposes a distributed discretization algorithm for Big Data analytics based on evolutionary optimization. After comparing with a distributed discretizer based on the Minimum Description Length Principle, we have found that our solution yields more accurate and simpler solutions in reasonable time.

Bibliographic details

Source
Swarm and Evolutionary Computation | 2018, issue 2018 | 11 pages
Authors
Ramirez-Gallego S.; Garcia S.; Benitez J. M.; Herrera F.
Author affiliations

King Abdulaziz Univ, Fac Comp & Informat Technol, North Jeddah, Saudi Arabia;

Univ Granada, CITIC UGR, Dept Comp Sci & Artificial Intelligence, E-18071 Granada, Spain;

King Abdulaziz Univ, Fac Comp & Informat Technol, North Jeddah, Saudi Arabia;

King Abdulaziz Univ, Fac Comp & Informat Technol, North Jeddah, Saudi Arabia

Indexing information
Original format: PDF
Language: English
Chinese Library Classification: Computing technology, computer technology
Keywords
Discretization; Evolutionary computation; Big Data; Data Mining; Apache Spark
