Conference: International Conference on Advanced Intelligent Systems and Informatics

Effective Selection of Machine Learning Algorithms for Big Data Analytics Using Apache Spark

Abstract

Big Data involves not only growing data volumes but also complex and varied processing and analytical tools. This research compares selected machine learning algorithms on datasets of different types and sizes using Apache Spark, in order to make a fair judgment about which algorithm fits each dataset best. The algorithms were compared on a few parameters, chiefly accuracy and training time. They were applied to three datasets from different fields: marketing, packing and statistics, and security. The findings show that the decision tree algorithm is the most suitable for the marketing and security datasets, while logistic regression had the highest accuracy on the packing and statistics dataset.
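
The two comparison criteria named in the abstract, accuracy and training time, can be illustrated with a small harness. This is only a sketch under stated assumptions, not the authors' experimental setup: it uses plain Python rather than Apache Spark, and the classifiers (`MajorityClassifier`, `ThresholdStump`), the helper functions (`evaluate`, `compare`), and the tiny dataset are all hypothetical stand-ins. With Spark MLlib, the same pattern would presumably wrap estimators such as `DecisionTreeClassifier` and `LogisticRegression` instead.

```python
import time
from collections import Counter


class MajorityClassifier:
    """Hypothetical toy baseline: always predicts the most frequent training label."""

    def fit(self, rows, labels):
        self.label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, row):
        return self.label


class ThresholdStump:
    """Hypothetical one-feature stump: predicts 1 when feature 0 exceeds
    the midpoint between the two class means (needs both classes in training)."""

    def fit(self, rows, labels):
        pos = [r[0] for r, y in zip(rows, labels) if y == 1]
        neg = [r[0] for r, y in zip(rows, labels) if y == 0]
        self.threshold = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
        return self

    def predict(self, row):
        return 1 if row[0] > self.threshold else 0


def evaluate(model, train, test):
    """Return (accuracy on the test split, wall-clock training time in seconds)."""
    train_rows, train_labels = train
    test_rows, test_labels = test
    start = time.perf_counter()
    model.fit(train_rows, train_labels)
    train_seconds = time.perf_counter() - start
    correct = sum(model.predict(r) == y for r, y in zip(test_rows, test_labels))
    return correct / len(test_labels), train_seconds


def compare(models, train, test):
    """Evaluate every model on the same split so the comparison is like for like."""
    return {name: evaluate(model, train, test) for name, model in models.items()}


# Made-up illustrative data: one numeric feature, binary labels.
TRAIN = ([[0.1], [0.2], [0.3], [0.8], [0.9]], [0, 0, 0, 1, 1])
TEST = ([[0.15], [0.25], [0.85], [0.95]], [0, 0, 1, 1])

results = compare(
    {"majority": MajorityClassifier(), "stump": ThresholdStump()}, TRAIN, TEST
)
for name, (accuracy, seconds) in results.items():
    print(f"{name}: accuracy={accuracy:.2f}, training_time={seconds:.6f}s")
```

Timing only the `fit` call keeps training time separate from prediction cost, which matches the abstract's distinction between accuracy and training time as separate criteria.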
