International Journal of Hybrid Intelligent Systems

Predicting execution time of machine learning tasks for scheduling


Abstract

Lately, many academic and industrial fields have shifted their research focus from data acquisition to data analysis. This transition has been facilitated by the use of Machine Learning (ML) techniques to automatically identify patterns and extract non-trivial knowledge from data. The associated experimental procedures are usually complex and computationally demanding. To deal with such a scenario, Distributed Heterogeneous Computing (DHC) systems can be employed. To fully benefit from DHC facilities, a suitable scheduling policy should be applied to decide how to allocate tasks to the available resources. An important step toward this is to estimate how long an application will take to execute. In this paper, we present an approach for predicting the execution time of ML tasks specifically. It employs a meta-learning framework to relate characteristics of datasets and the current machine state to actual execution time. An empirical study was conducted using 78 publicly available datasets, 6 ML algorithms and 4 meta-regressors. Experimental results show that our approach outperforms a commonly used baseline method. After establishing SVM as the most promising meta-regressor, we employed its predictions to build schedule plans. In a simulation of a small-scale DHC environment, a simple Genetic Algorithm based scheduler was employed for task allocation, leading to minimized overall completion time. These achievements indicate the potential of meta-learning to tackle the problem and encourage further developments.
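
To make the scheduling step more concrete, the following is a minimal sketch of a genetic algorithm that assigns tasks to heterogeneous machines so as to reduce the overall completion time (makespan), given predicted execution times. It is not the authors' scheduler: the function names `makespan` and `ga_schedule`, the one-point crossover, the truncation selection, the parameter values and the `predicted` table are all assumptions chosen for illustration. In the paper's setting, the time table would come from the meta-regressor's predictions.

```python
import random


def makespan(assignment, times):
    """Completion time of the busiest machine for a task-to-machine assignment."""
    loads = [0.0] * len(times[0])
    for task, machine in enumerate(assignment):
        loads[machine] += times[task][machine]
    return max(loads)


def ga_schedule(times, pop_size=30, generations=100, mutation_rate=0.1, seed=0):
    """Search for an assignment with a small makespan.

    times[t][m] is the predicted execution time of task t on machine m.
    All operators and parameter values here are illustrative assumptions.
    """
    rng = random.Random(seed)
    n_tasks, n_machines = len(times), len(times[0])
    # Each individual maps task index -> machine index.
    population = [[rng.randrange(n_machines) for _ in range(n_tasks)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda a: makespan(a, times))
        survivors = population[:pop_size // 2]  # keep the better half unchanged
        children = []
        while len(survivors) + len(children) < pop_size:
            p1, p2 = rng.sample(survivors, 2)
            cut = rng.randrange(1, n_tasks) if n_tasks > 1 else 0
            child = p1[:cut] + p2[cut:]  # one-point crossover
            # Mutation: occasionally move a task to a random machine.
            child = [rng.randrange(n_machines) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = survivors + children
    return min(population, key=lambda a: makespan(a, times))


# Hypothetical predicted times in seconds: 6 tasks on 3 heterogeneous machines.
predicted = [
    [4.0, 6.0, 9.0],
    [7.0, 3.0, 8.0],
    [5.0, 5.0, 2.0],
    [9.0, 4.0, 6.0],
    [3.0, 8.0, 7.0],
    [6.0, 2.0, 5.0],
]

best = ga_schedule(predicted)
print("assignment:", best, "makespan:", makespan(best, predicted))
```

Because the better half of each generation is carried over unchanged, the best makespan found never gets worse from one generation to the next. The quality of the resulting plan still depends on how accurate the predicted times are, which is why execution time prediction matters for scheduling.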
