Effective Selection of Machine Learning Algorithms for Big Data Analytics Using Apache Spark

Abstract

Big Data is characterized not only by growing data volumes but also by complex and varied processing and analytical tools. This research compares selected machine learning algorithms on datasets of different types and sizes using Apache Spark, in order to judge fairly which algorithm fits each dataset best. The algorithms were compared on a few parameters, chiefly accuracy and training time. They were applied to three datasets from different fields: marketing, packing and statistics, and security. The findings show that the decision tree algorithm is the most suitable for the marketing and security datasets, while logistic regression achieved the highest accuracy on the packing and statistics dataset.
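The comparison criteria named in the abstract (accuracy and training time) can be illustrated with a small, library-free sketch. This is not the authors' pipeline, which this record does not show. The two toy classifiers, the `compare` helper, and the sample data are all hypothetical stand-ins. In a Spark setting, the toy models would be replaced by MLlib estimators.

```python
import time
from collections import Counter


class MajorityClassifier:
    """Toy baseline (hypothetical): always predicts the most frequent training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class ThresholdStump:
    """Toy one-level decision tree (hypothetical): predicts 1 when x[0] > threshold."""

    def fit(self, X, y):
        # Try each observed value as a threshold; keep the one with most training hits.
        candidates = sorted({x[0] for x in X})
        self.threshold_ = max(
            candidates,
            key=lambda t: sum((x[0] > t) == bool(label) for x, label in zip(X, y)),
        )
        return self

    def predict(self, X):
        return [int(x[0] > self.threshold_) for x in X]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def compare(models, X_train, y_train, X_test, y_test):
    """Fit each model, recording wall-clock training time and test accuracy."""
    results = {}
    for name, model in models.items():
        start = time.perf_counter()
        model.fit(X_train, y_train)
        train_seconds = time.perf_counter() - start
        results[name] = {
            "accuracy": accuracy(y_test, model.predict(X_test)),
            "train_seconds": train_seconds,
        }
    return results


# Made-up data for illustration only; not one of the paper's datasets.
X_train = [[0.1], [0.4], [0.5], [0.9], [1.2], [1.5]]
y_train = [0, 0, 0, 1, 1, 1]
X_test = [[0.2], [0.3], [1.0], [1.4]]
y_test = [0, 0, 1, 1]

results = compare(
    {"majority": MajorityClassifier(), "stump": ThresholdStump()},
    X_train, y_train, X_test, y_test,
)
best = max(results, key=lambda name: results[name]["accuracy"])
```

Timing only the `fit` call keeps the training-time figure separate from prediction cost. A single wall-clock measurement is noisy, so a real comparison would typically average over repeated runs.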

Bibliographic details

Source: International Conference on Advanced Intelligent Systems and Informatics, 2017, xxvii, 913 p. (13 pages)
Authors: Manar Mohamed Hafez; Mohamed Elemam Shehab; Essam El Fakharany; Abd El Ftah Abdel Ghfar Hegazy
Original format: PDF
Chinese Library Classification: TP182-532
Keywords: Big Data; Apache Spark; Machine learning algorithms; Decision tree; Naïve Bayes; Logistic regression; Gradient boosted trees; Random forest

Similar literature

1. Research on Visual Machine Learning Algorithms Based on Apache Spark in Big Data Environment [J]. Wang Jialin. Basic & Clinical Pharmacology & Toxicology, 2019, No. S1.
2. Research on Visual Machine Learning Algorithms Based on Apache Spark in Big Data Environment [J]. Wang Jialin. Basic & Clinical Pharmacology & Toxicology, 2019, No. S3.
3. Mobile big data analytics using deep learning and Apache Spark [J]. Mohammad Abu Alsheikh, Dusit Niyato, Shaowei Lin. IEEE Network, 2016, No. 3.
4. Effective Selection of Machine Learning Algorithms for Big Data Analytics Using Apache Spark [C]. Manar Mohamed Hafez, Mohamed Elemam Shehab, Essam El Fakharany. International Conference on Advanced Intelligent Systems and Informatics, 2017.
5. Performance Evaluation of Machine Learning Algorithms in Apache Spark for Intrusion Detection [D]. Dobson, Anthony M. 2018.
6. SPARK-MSNA: Efficient algorithm on Apache Spark for aligning multiple similar DNA/RNA sequences with supervised learning [O]. V. Vineetha, C. L. Biji, Achuthsankar S. Nair.
7. Real-time Big Data Analytics for Feature Selection on Apache Spark [O]. 2020.
