Facing the full model selection problem in high volume datasets employing intelligent proxy models

Diaz-Pacheco Angel; Reyes-Garcia Carlos A.

Abstract

Full model selection is a technique for improving the accuracy of machine learning algorithms by searching, for each dataset, for the most appropriate combination of feature selection, data preparation, a learning algorithm, and its hyper-parameter settings. This paradigm has been widely studied on datasets of moderate size but poorly explored on high-volume datasets. One of the main reasons is the large search space and the high number of fitness evaluations of candidate models that the search requires. To overcome this obstacle, the use of proxy models, or surrogate functions, has been proposed in the literature. In this work, we propose using the full model selection paradigm to construct proxy models. These proxy models were employed to assist the search for models in high-volume datasets, both to reduce the number of fitness evaluations and to guide the search. The results show performance with no significant differences from the complete search algorithm while using only one third of the expensive fitness evaluations.
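The general idea of a proxy-assisted search can be illustrated with a minimal sketch. This is not the authors' algorithm: the abstract says their proxy models are themselves built through full model selection, whereas the sketch below swaps in a simple k-nearest-neighbour average as the proxy, and it treats a candidate as a point in a two-dimensional unit square rather than a full pipeline. All function names, the toy fitness function, and the parameter values are hypothetical and chosen only for illustration. In each generation, the proxy ranks the whole population and only the top third receives the expensive evaluation.

```python
import random


def expensive_fitness(candidate):
    """Toy stand-in (assumption) for training and validating a full model."""
    x, y = candidate
    return 1.0 - ((x - 0.3) ** 2 + (y - 0.7) ** 2)


def proxy_predict(candidate, archive, k=3):
    """Cheap estimate: mean true score of the k nearest evaluated candidates."""
    nearest = sorted(
        archive,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], candidate)),
    )[:k]
    return sum(score for _, score in nearest) / len(nearest)


def proxy_assisted_search(generations=10, pop_size=30, seed=0):
    rng = random.Random(seed)

    # Seed the archive with a few true evaluations so the proxy has data.
    archive = []
    for _ in range(5):
        c = (rng.random(), rng.random())
        archive.append((c, expensive_fitness(c)))
    evaluations = len(archive)

    for _ in range(generations):
        best, _ = max(archive, key=lambda item: item[1])

        # Mutate the best known candidate to create a new population,
        # clipping each coordinate to the unit interval.
        population = [
            tuple(min(1.0, max(0.0, v + rng.gauss(0, 0.1))) for v in best)
            for _ in range(pop_size)
        ]

        # Rank with the proxy; only the top third is evaluated for real.
        ranked = sorted(
            population,
            key=lambda c: proxy_predict(c, archive),
            reverse=True,
        )
        for c in ranked[: pop_size // 3]:
            archive.append((c, expensive_fitness(c)))
            evaluations += 1

    best, score = max(archive, key=lambda item: item[1])
    return best, score, evaluations


if __name__ == "__main__":
    best, score, evaluations = proxy_assisted_search()
    print(best, score, evaluations)
```

With the default settings, the sketch spends 5 initial evaluations plus 10 per generation, against the 30 per generation that evaluating every candidate would cost. The one-third ratio is taken from the abstract; every other setting here is arbitrary.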

Bibliographic details

Source
Intelligent Data Analysis, 2019, No. 5, pp. 1109-1129 (21 pages)
Authors
Diaz-Pacheco Angel; Reyes-Garcia Carlos A.
Author affiliations

Inst Nacl Astrofis Opt & Electra Comp Sci Dept Luis Enrique Erro 1 Puebla 72840 Mexico;

Inst Nacl Astrofis Opt & Electra Comp Sci Dept Luis Enrique Erro 1 Puebla 72840 Mexico;
Indexed in: Science Citation Index (SCI), USA
Original format: PDF
Language: English
Keywords
Proxy models; model selection; big datasets;

Similar literature

1. New Workflow for QSAR Model Development from Small Data Sets: Small Dataset Curator and Small Dataset Modeler. Integration of Data Curation, Exhaustive Double Cross-Validation, and a Set of Optimal Model Selection Techniques [J]. Ambure Pravin, Gajewicz-Skretna Agnieszka, Cordeiro M. Natalia D. S. Journal of Chemical Information and Modeling, 2019, No. 10.
2. Towards a New Approach for Modeling Volume Datasets Based on Orthogonal Polytopes in Four-Dimensional Color Space [J]. Ricardo Perez-Aguila. Engineering Letters, 2010, No. 4.
3. Historical Datasets Support Genomic Selection Models for the Prediction of Cotton Fiber Quality Phenotypes Across Multiple Environments [J]. Washington Gapare, Shiming Liu, Warren Conaty. G3: Genes, Genomes, Genetics, 2018, No. 5.
4. Full Model Selection in Huge Datasets and for Proxy Models Construction [C]. Angel Diaz-Pacheco, Carlos Alberto Reyes-Garcia. Mexican International Conference on Artificial Intelligence, 2018.
5. Development of selection evaluation and system intelligence analytic models for the intelligent building control systems [D]. Wong, Kwok Wai Johnny. 2007.
6. Historical Datasets Support Genomic Selection Models for the Prediction of Cotton Fiber Quality Phenotypes Across Multiple Environments [O]. Washington Gapare, Shiming Liu, Warren Conaty. 2018.
7. Volumes of expression: artistic modelling and rendering of volume datasets [O]. Jones, Mark. 2001.