Conference: European Conference on the Applications of Evolutionary Computation

Evolution of Scikit-Learn Pipelines with Dynamic Structured Grammatical Evolution

Abstract

Deploying Machine Learning (ML) models is a difficult and time-consuming job comprising a series of sequential, correlated tasks, from data pre-processing and the design and extraction of features to the choice of the ML algorithm and its parameterisation. The task is even more challenging because feature design is in many cases problem-specific and thus requires domain expertise. To overcome these limitations, Automated Machine Learning (AutoML) methods seek to automate the design of pipelines with little or no human intervention, i.e., to automate the selection of the sequence of methods that have to be applied to the raw data. These methods have the potential to enable non-expert users to use ML and to provide expert users with solutions that they would be unlikely to consider. In particular, this paper describes AutoML-DSGE, a novel grammar-based framework that adapts Dynamic Structured Grammatical Evolution (DSGE) to the evolution of Scikit-Learn classification pipelines. The experimental results include a comparison of AutoML-DSGE with another grammar-based AutoML framework, Resilient Classification Pipeline Evolution (RECIPE), and show that the average performance of the classification pipelines generated by AutoML-DSGE is always superior to the average performance of RECIPE; the differences are statistically significant in 3 of the 10 datasets used.
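To make the idea of a grammar-based pipeline representation concrete, the following is a minimal sketch and not the AutoML-DSGE implementation. It assumes a hypothetical toy grammar (`GRAMMAR`) and a genotype that stores one list of integer choices per non-terminal, which is the general idea behind structured grammatical representations. The helper names `random_genotype` and `decode` are illustrative only; the abstract does not describe the paper's actual grammar, operators, or fitness function.

```python
import random

# Hypothetical toy grammar (an assumption, not the grammar used in the paper).
# Each non-terminal maps to a list of productions; each production is a list
# of symbols. Symbols that are not keys of GRAMMAR are terminals.
GRAMMAR = {
    "<pipeline>": [["<preprocessing>", "<classifier>"], ["<classifier>"]],
    "<preprocessing>": [["StandardScaler"], ["MinMaxScaler"], ["PCA"]],
    "<classifier>": [
        ["DecisionTreeClassifier"],
        ["KNeighborsClassifier"],
        ["LogisticRegression"],
    ],
}


def random_genotype(rng):
    """Draw one production index per non-terminal.

    In this toy grammar every non-terminal is expanded at most once,
    so a single integer per non-terminal is enough.
    """
    return {nt: [rng.randrange(len(prods))] for nt, prods in GRAMMAR.items()}


def decode(genotype, start="<pipeline>"):
    """Map a genotype to an ordered list of pipeline step names.

    Expansion is leftmost and depth-first; each non-terminal reads the
    next unused integer from its own list in the genotype.
    """
    positions = {nt: 0 for nt in GRAMMAR}

    def expand(symbol):
        if symbol not in GRAMMAR:  # terminal: emit the step name
            return [symbol]
        choice = genotype[symbol][positions[symbol]]
        positions[symbol] += 1
        steps = []
        for child in GRAMMAR[symbol][choice]:
            steps.extend(expand(child))
        return steps

    return expand(start)


if __name__ == "__main__":
    genotype = random_genotype(random.Random(0))
    print(decode(genotype))
```

The terminals here are names of real scikit-learn classes, so a decoded list could in principle be turned into a `sklearn.pipeline.Pipeline` and scored by cross-validation; that step, along with mutation and crossover over the per-non-terminal lists, is left out of this sketch.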