International Conference on Machine Learning, Optimization, and Data Science

Trading-off Data Fit and Complexity in Training Gaussian Processes with Multiple Kernels



Abstract

Gaussian processes (GPs) belong to a class of probabilistic techniques that have been successfully used in different domains of machine learning and optimization. They are popular because they provide uncertainties in predictions, which sets them apart from other modelling methods that provide only point predictions. The uncertainty is particularly useful for decision making, as we can gauge how reliable a prediction is. One of the fundamental challenges in using GPs is that the efficacy of a model depends on selecting an appropriate kernel and the associated hyperparameter values for a given problem. Furthermore, the training of GPs, that is, optimizing the hyperparameters using a data set, is traditionally performed using a cost function that is a weighted sum of data fit and model complexity, and the underlying trade-off is completely ignored. Addressing these challenges and shortcomings, in this article, we propose the following automated training scheme. Firstly, we use a weighted product of multiple kernels, with a view to relieving users of the need to choose an appropriate kernel for the problem at hand without any domain-specific knowledge. Secondly, for the first time, we modify GP training by using a multi-objective optimizer to tune the hyperparameters and weights of multiple kernels and extract an approximation of the complete trade-off front between data fit and model complexity. We then propose a novel solution selection strategy based on mean standardized log loss (MSLL) to select a solution from the estimated trade-off front and finalise training of a GP model. The results on three data sets, and comparison with the standard approach, clearly show the potential benefit of the proposed approach of using multi-objective optimization with multiple kernels.
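The abstract does not spell out the paper's kernel set or optimizer, but the two objectives it trades off fall directly out of the GP log marginal likelihood, log p(y|X) = -1/2 y^T K^{-1} y - 1/2 log|K| - n/2 log 2π: the first term measures data fit, the second penalizes model complexity. A minimal NumPy sketch of a weighted product of kernels and these two objectives, assuming an RBF and an exponential kernel as the base kernels (the paper's actual kernel choices may differ):

```python
import numpy as np

def sqdist(X1, X2):
    """Pairwise squared Euclidean distances between rows of X1 and X2."""
    return (np.sum(X1**2, axis=1)[:, None]
            + np.sum(X2**2, axis=1)[None, :]
            - 2.0 * X1 @ X2.T)

def rbf(X1, X2, ls):
    """Squared-exponential (RBF) kernel with lengthscale ls."""
    return np.exp(-0.5 * sqdist(X1, X2) / ls**2)

def exponential(X1, X2, ls):
    """Exponential (Matern-1/2) kernel with lengthscale ls."""
    return np.exp(-np.sqrt(np.maximum(sqdist(X1, X2), 0.0)) / ls)

def weighted_product_kernel(X1, X2, weights, lengthscales):
    """Weighted product of base kernels: prod_i k_i(x, x')**w_i.

    Raising these positive-valued kernels to powers w_i >= 0 only
    rescales their lengthscale parameters, so the product is still
    a valid (positive semi-definite) kernel.
    """
    K = np.ones((X1.shape[0], X2.shape[0]))
    for w, ls, k in zip(weights, lengthscales, (rbf, exponential)):
        K *= k(X1, X2, ls) ** w
    return K

def data_fit_and_complexity(X, y, weights, lengthscales, noise=1e-2):
    """The two objectives hidden inside the GP log marginal likelihood:
        data fit   = -1/2 * y^T K^-1 y
        complexity = -1/2 * log|K|
    Standard training maximizes their (implicitly weighted) sum; the
    paper instead treats them as separate objectives.
    """
    K = weighted_product_kernel(X, X, weights, lengthscales)
    K += noise * np.eye(len(X))          # observation noise jitter
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    data_fit = -0.5 * y @ alpha
    complexity = -np.sum(np.log(np.diag(L)))  # -1/2 log|K| via Cholesky
    return data_fit, complexity
```

A multi-objective optimizer (e.g. an evolutionary algorithm such as NSGA-II) would then vary the weights and lengthscales to approximate the Pareto front between these two quantities, with MSLL used afterwards to pick a single solution from that front.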


