Conference: IEEE International Conference on Cloud Computing Technology and Science

Optimizing Multiple Machine Learning Jobs on MapReduce

Abstract

Recently, MapReduce has been used to parallelize machine learning algorithms. Obtaining the best performance from these algorithms requires tuning their parameters. However, tuning is time-consuming because it requires executing a MapReduce program multiple times with different parameters. These multiple executions can be assigned to a cluster in various ways, and the execution time varies depending on the assignment. To achieve the shortest execution time, we propose a method for optimizing the assignment of MapReduce jobs to a cluster, assuming a runtime targeted at machine learning. We developed an execution cost model to predict the total execution time of the jobs and obtained the optimal assignment by minimizing the cost model. To evaluate the proposed method, we implemented an experimental MapReduce runtime based on the Message Passing Interface and executed logistic regression in four cases. The results showed that the proposed method can correctly predict the optimal job assignment. We also confirmed that the optimal assignment reduced execution time by up to 77% compared to the worst assignment.
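
The abstract does not give the cost model itself, so the following is only a minimal sketch of the general idea of choosing an assignment by minimizing a predicted total time. Everything in it is an assumption made for illustration: the functions `job_time`, `total_time`, and `best_assignment`, the cost terms (`work`, `comm`, `overhead`), and the restriction to running jobs in equal-sized waves are hypothetical and are not taken from the paper.

```python
import math


def job_time(nodes, work=120.0, comm=2.0, overhead=1.0):
    """Hypothetical cost of one job on `nodes` nodes (not the paper's model).

    Computation shrinks with more nodes, while a communication term
    grows logarithmically with the node count.
    """
    return overhead + work / nodes + comm * math.log2(nodes)


def total_time(num_jobs, cluster_nodes, concurrent):
    """Predicted total time when `concurrent` jobs run side by side.

    Assumption: jobs run in waves, and each job in a wave gets an
    equal share of the cluster.
    """
    nodes_per_job = cluster_nodes // concurrent
    waves = math.ceil(num_jobs / concurrent)
    return waves * job_time(nodes_per_job)


def best_assignment(num_jobs, cluster_nodes):
    """Enumerate candidate concurrency levels and return the cheapest one."""
    candidates = [
        g for g in range(1, cluster_nodes + 1)
        if cluster_nodes % g == 0 and g <= num_jobs
    ]
    return min(candidates, key=lambda g: total_time(num_jobs, cluster_nodes, g))


if __name__ == "__main__":
    jobs, nodes = 8, 16
    g = best_assignment(jobs, nodes)
    print(f"run {g} jobs at a time, predicted time {total_time(jobs, nodes, g):.1f}")
```

With a different balance between the computation and communication terms, the cheapest concurrency level shifts, which is the kind of trade-off a cost model of this sort is meant to capture.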
