Source: JMLR: Workshop and Conference Proceedings

Risk-Averse Stochastic Convex Bandit

Abstract

Motivated by applications in clinical trials and finance, we study the problem of online convex optimization (with bandit feedback) where the decision maker is risk-averse. We provide two algorithms to solve this problem. The first one is a descent-type algorithm which is easy to implement. The second algorithm, which combines the ellipsoid method and a center point device, achieves (almost) optimal regret bounds with respect to the number of rounds. To the best of our knowledge, this is the first attempt to address risk-aversion in the online convex bandit problem.
