Algorithm portfolio selection as a bandit problem with unbounded losses

Matteo Gagliolo; Juergen Schmidhuber

Abstract

We propose a method that learns to allocate computation time to a given set of algorithms, of unknown performance, with the aim of solving a given sequence of problem instances in a minimum time. Analogous meta-learning techniques are typically based on models of algorithm performance, learned during a separate offline training sequence, which can be prohibitively expensive. We adopt instead an online approach, named GambleTA, in which algorithm performance models are iteratively updated, and used to guide allocation on a sequence of problem instances. GambleTA is a general method for selecting among two or more alternative algorithm portfolios. Each portfolio has its own way of allocating computation time to the available algorithms, possibly based on performance models, in which case its performance is expected to improve over time, as more runtime data becomes available. The resulting exploration-exploitation trade-off is represented as a bandit problem. In our previous work, the algorithms corresponded to the arms of the bandit, and allocations evaluated by the different portfolios were mixed, using a solver for the bandit problem with expert advice, but this required the setting of an arbitrary bound on algorithm runtimes, invalidating the optimal regret of the solver. In this paper, we propose a simpler version of GambleTA, in which the allocators correspond to the arms, such that a single portfolio is selected for each instance. The selection is represented as a bandit problem with partial information, and an unknown bound on losses. We devise a solver for this game, proving a bound on its expected regret. We present experiments based on results from several solver competitions, in various domains, comparing GambleTA with another online method.
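The selection step described in the abstract (one arm per allocator, only the chosen arm's loss observed, no loss bound known in advance) can be illustrated with a generic sketch. This is not the solver from the paper, whose details the abstract does not give. It is a plain Exp3-style exponential-weights selector combined with an assumed doubling rule: whenever an observed loss exceeds the current guess of the bound, the guess is doubled and the loss estimates are restarted. The names Exp3UnknownBound and simulate, the learning rate eta, the initial bound, and the fixed toy runtimes are all hypothetical choices made for illustration.

```python
import math
import random


class Exp3UnknownBound:
    """Exp3-style arm selection with a guessed loss bound (assumed doubling rule)."""

    def __init__(self, n_arms, eta=0.1, initial_bound=1.0, seed=0):
        self.n_arms = n_arms
        self.eta = eta                  # learning rate (hypothetical default)
        self.bound = initial_bound      # current guess of the largest loss
        self.rng = random.Random(seed)
        self.cum_loss = [0.0] * n_arms  # importance-weighted, normalised losses

    def probabilities(self):
        # Exponential weights; subtracting the minimum keeps exponents <= 0.
        low = min(self.cum_loss)
        weights = [math.exp(-self.eta * (c - low)) for c in self.cum_loss]
        total = sum(weights)
        return [w / total for w in weights]

    def select(self):
        # Sample one arm (one allocator) and return it with its probability.
        probs = self.probabilities()
        arm = self.rng.choices(range(self.n_arms), weights=probs)[0]
        return arm, probs[arm]

    def update(self, arm, prob, loss):
        # Partial information: only the chosen arm's loss is observed.
        if loss > self.bound:
            # The guess was too small: double it until it covers the loss,
            # then restart the loss estimates under the new scale.
            while loss > self.bound:
                self.bound *= 2.0
            self.cum_loss = [0.0] * self.n_arms
        # Importance weighting keeps the loss estimate unbiased.
        self.cum_loss[arm] += (loss / self.bound) / prob


def simulate(runtimes, rounds, seed=0):
    """Run the selector on fixed per-arm runtimes (toy stand-ins for losses)."""
    solver = Exp3UnknownBound(len(runtimes), seed=seed)
    for _ in range(rounds):
        arm, prob = solver.select()
        solver.update(arm, prob, runtimes[arm])
    return solver
```

For instance, simulate([1.0, 5.0], 2000) treats two allocators whose per-instance runtimes are always 1 and 5 time units. The guessed bound grows from 1 to 8 the first time the slower arm is tried, and the selection probability then concentrates on the faster arm.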

Bibliographic details

Source
Annals of Mathematics and Artificial Intelligence, 2011, Issue 2, pp. 49-86 (38 pages)
Authors
Matteo Gagliolo; Juergen Schmidhuber
Affiliations

CoMo, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussels, Belgium;

IDSIA, Galleria 2, 6928 Manno (Lugano), Switzerland; Faculty of Informatics, University of Lugano, Via Buffi 13, 6904 Lugano, Switzerland;

Original format: PDF
Language: English
Keywords
algorithm selection; algorithm portfolios; meta learning; online learning; multi-armed bandit problem; survival analysis; Las Vegas algorithms; computational complexity; combinatorial optimization; constraint programming; satisfiability

Similar literature

1. Risk-aware multi-armed bandit problem with application to portfolio selection [J]. Xiaoguang Huo, Feng Fu. Royal Society Open Science, 2017, Issue 11
2. Mean-variance portfolio selection in a complete market with unbounded random coefficients [J]. Shen Yang. Automatica, 2015
3. Online Learning in Case of Unbounded Losses Using Follow the Perturbed Leader Algorithm [J]. V'yugin, Vladimir V. Journal of Machine Learning Research, 2011, January
4. Algorithm Selection as a Bandit Problem with Unbounded Losses [C]. Matteo Gagliolo, Jürgen Schmidhuber. Learning and Intelligent Optimization, 2010
5. New algorithms for optimal portfolio selection [D]. Magoc, Tanja. 2009
6. Risk-aware multi-armed bandit problem with application to portfolio selection [O]. Xiaoguang Huo, Feng Fu. 2017
7. Algorithm Portfolio Selection as a Bandit Problem with Unbounded Losses [O]. Gagliolo Matteo, Schmidhuber Juergen. 2011
