Journal: Natural Computing

A study of model and hyper-parameter selection strategies for classifier ensembles: a robust analysis on different optimization algorithms and extended results

Abstract

Machine learning (ML) techniques play an important role in many real-world applications. One of the main challenges, however, is selecting the most accurate technique for a specific application. In the classification context, for instance, two main approaches can be applied: model selection and hyper-parameter selection. In the first approach, the best classification algorithm is selected for a given input dataset by performing a heuristic search over a large space of candidate classification algorithms and their corresponding hyper-parameter settings. Because the main focus of this approach is the selection of the classification algorithm, it is referred to as model selection; methods of this kind are also called automated machine learning (Auto-ML). The second approach fixes one classification system and performs an extensive search to select the best hyper-parameters for that model. In this paper, we perform a wide and robust comparative analysis of both approaches for classifier ensembles. In this analysis, two methods of the first approach (Auto-WEKA and H2O) are compared with four methods of the second approach (Genetic Algorithm, Particle Swarm Optimization, Tabu Search and GRASP). The main aim is to determine which of these techniques generates more accurate classifier ensembles under a time constraint. The empirical analysis is conducted on 21 classification datasets. Our findings indicate that using a hyper-parameter selection method provides the most accurate classifier ensembles, although this improvement was not detected by the statistical test.