Journal: Intelligent Data Analysis

Facing the full model selection problem in high volume datasets employing intelligent proxy models


Abstract

Full model selection is a technique for improving the accuracy of machine learning algorithms by searching, for each dataset, for the most appropriate combination of feature selection, data preparation, learning algorithm, and hyper-parameter settings. This paradigm has been widely studied on datasets of moderate size but remains poorly explored on high-volume datasets. One of the main reasons is the large search space and the high number of fitness evaluations of candidate models that the search requires. To overcome this obstacle, the use of proxy models, or surrogate functions, has been proposed in the literature. In this work, we propose using the full model selection paradigm itself to construct proxy models. These proxy models assist the search for models in high-volume datasets, reducing the number of fitness evaluations and guiding the search. The results show performance without significant differences from the complete search algorithm while using just one third of the expensive fitness evaluations.
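
The general idea of a proxy-assisted search can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract does not describe their search procedure or how their proxy models are built, so everything below is an assumption made for illustration. The candidate encoding (two numbers in [0, 1]), the synthetic `expensive_fitness`, the nearest-neighbour `surrogate_predict`, and the function `proxy_assisted_search` are all hypothetical names. The sketch only shows the budget mechanism: a cheap proxy ranks every candidate, and only the top third receive an expensive evaluation.

```python
import random


def expensive_fitness(candidate):
    """Hypothetical stand-in for training and validating a full pipeline."""
    x, y = candidate
    # Higher is better; the synthetic optimum sits at (0.3, 0.7).
    return -((x - 0.3) ** 2 + (y - 0.7) ** 2)


def surrogate_predict(candidate, archive, k=3):
    """Cheap proxy: mean fitness of the k nearest evaluated candidates."""
    def squared_distance(item):
        return sum((a - b) ** 2 for a, b in zip(candidate, item[0]))

    nearest = sorted(archive, key=squared_distance)[:k]
    return sum(fitness for _, fitness in nearest) / len(nearest)


def proxy_assisted_search(generations=10, pop_size=12, seed=0):
    rng = random.Random(seed)
    budget = pop_size // 3  # expensive evaluations allowed per generation

    # Seed the archive with a few expensively evaluated random candidates.
    archive = []
    for _ in range(budget):
        candidate = (rng.random(), rng.random())
        archive.append((candidate, expensive_fitness(candidate)))
    evaluations = len(archive)

    for _ in range(generations):
        best = max(archive, key=lambda item: item[1])[0]
        # Propose new candidates by perturbing the best one found so far.
        population = [
            tuple(min(1.0, max(0.0, v + rng.gauss(0, 0.1))) for v in best)
            for _ in range(pop_size)
        ]
        # Rank all proposals with the proxy, then evaluate only the top third.
        ranked = sorted(
            population,
            key=lambda c: surrogate_predict(c, archive),
            reverse=True,
        )
        for candidate in ranked[:budget]:
            archive.append((candidate, expensive_fitness(candidate)))
            evaluations += 1

    best_candidate, best_fitness = max(archive, key=lambda item: item[1])
    return best_candidate, best_fitness, evaluations
```

With the default arguments, the sketch spends 44 expensive evaluations, whereas evaluating every proposed candidate would take 132. In a real full model selection setting, a candidate would encode a feature selection method, a preprocessing step, a learner, and its hyper-parameters, and the proxy would be a trained model, not a nearest-neighbour average.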
