Robust Risk-Averse Stochastic Multi-armed Bandits



Abstract

We study a variant of the standard stochastic multi-armed bandit problem in which one is interested not in the arm with the best mean, but in the arm maximizing some coherent risk measure criterion. Further, we study the deviations of the regret instead of the less informative expected regret. We provide an algorithm, called RA-UCB, to solve this problem, together with a high-probability bound on its regret.
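As an illustration of the problem setting only, the sketch below shows how the arm with the best empirical mean can differ from the arm that is best under a risk-averse criterion. It is not RA-UCB, whose details this record does not give. The risk measure used here, lower-tail CVaR, is an assumption chosen because it is a familiar coherent risk measure; the record does not say which measure the paper uses. The arm names, reward samples, and the `empirical_cvar` helper are hypothetical.

```python
import math


def empirical_cvar(rewards, alpha):
    """Mean of the worst ceil(alpha * n) rewards (lower-tail CVaR for rewards).

    Assumption: CVaR stands in for "some coherent risk measure"; it is not
    taken from the paper.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(rewards)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


# Hypothetical reward samples for two arms (illustrative data only).
samples = {
    "steady": [0.4, 0.5, 0.5, 0.6, 0.5, 0.4, 0.6, 0.5, 0.5, 0.5],
    "volatile": [0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0],
}

# Arm preferred when only the empirical mean matters.
best_by_mean = max(samples, key=lambda arm: sum(samples[arm]) / len(samples[arm]))

# Arm preferred when the average of the worst 20% of outcomes matters.
best_by_cvar = max(samples, key=lambda arm: empirical_cvar(samples[arm], alpha=0.2))

print("best by mean:", best_by_mean)
print("best by CVaR:", best_by_cvar)
```

On this data the two criteria disagree: the volatile arm has the higher mean, while the steady arm has the better worst-case tail, which is the kind of trade-off a risk-averse bandit criterion is meant to capture.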


Bibliographic details

Source: International Conference on Algorithmic Learning Theory, 2013, 16 pages
Author: Odalric-Ambrym Maillard
Original format: PDF
Chinese Library Classification: TP301.6-53
Keywords: Multi-armed bandits; Coherent risk measure; Cumulant generative function; Concentration of measure

Similar literature

1. Residential HVAC Aggregation Based on Risk-averse Multi-armed Bandit Learning for Secondary Frequency Regulation [J]. Xinyi Chen, Qinran Hu, Qingxin Shi. Journal of Modern Power Systems and Clean Energy, 2020, No. 6.
2. Risk-Averse Multi-Armed Bandit Problems Under Mean-Variance Measure [J]. Sattar Vakili, Qing Zhao. IEEE Journal of Selected Topics in Signal Processing, 2016, No. 6.
3. Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems [J]. Sebastien Bubeck, Nicolo Cesa-Bianchi. Foundations and Trends in Machine Learning, 2012, No. 1.
4. Robust Risk-Averse Stochastic Multi-armed Bandits [C]. Odalric-Ambrym Maillard. International Conference on Algorithmic Learning Theory, 2013.
5. From Stability to Low-Regret Algorithms in Stochastic Multi-Armed Bandits [D]. Huang, Kuan-Sung. 2021.
6. An Analysis of the Value of Information When Exploring Stochastic Discrete Multi-Armed Bandits [O]. Isaac J. Sledge, José C. Príncipe. 2018.
7. Robust Risk-Averse Stochastic Multi-armed Bandits [O]. Odalric-Ambrym Maillard. 2013.
