Source: JMLR: Workshop and Conference Proceedings

Information Directed Sampling for Linear Partial Monitoring


Abstract

Partial monitoring is a rich framework for sequential decision making under uncertainty that generalizes many well-known bandit models, including linear, combinatorial, and dueling bandits. We introduce information directed sampling (IDS) for stochastic partial monitoring with a linear reward and observation structure. IDS achieves adaptive worst-case regret rates that depend on precise observability conditions of the game. Moreover, we prove lower bounds that classify the minimax regret of all finite games into four possible regimes. IDS achieves the optimal rate in all cases up to logarithmic factors, without tuning any hyper-parameters. We further extend our results to the contextual and the kernelized setting, which significantly increases the range of possible applications.
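
The abstract does not spell out how IDS selects actions. As a rough illustration only, the sketch below assumes the information-ratio rule commonly associated with IDS in the literature: choose the action that minimizes the squared regret estimate divided by an information-gain estimate. The function names, the deterministic (single-action) simplification, and the numbers are hypothetical and are not taken from the paper.

```python
def information_ratio(gap, info_gain, eps=1e-12):
    """Squared regret estimate divided by information gain.

    eps guards against division by zero for uninformative actions
    (an assumption of this sketch, not part of the paper).
    """
    return gap * gap / max(info_gain, eps)


def deterministic_ids(gaps, info_gains):
    """Return the index of the action with the smallest information ratio.

    This is a simplified deterministic variant: it picks one action
    instead of optimizing over distributions on actions.
    """
    ratios = [information_ratio(g, i) for g, i in zip(gaps, info_gains)]
    return min(range(len(ratios)), key=ratios.__getitem__)


# Hypothetical estimates for three actions. The third action has the
# largest regret estimate but is far more informative than the others.
gaps = [0.2, 0.3, 0.6]
info_gains = [0.001, 0.01, 1.0]
chosen = deterministic_ids(gaps, info_gains)
```

With these made-up numbers the rule picks the third action, since its high information gain outweighs its larger regret estimate. That is the kind of trade-off that matters in partial monitoring, where informative actions may be costly.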
