Conference: Annual Conference on Neural Information Processing Systems

Causal Bandits: Learning Good Interventions via Causal Inference

Abstract

We study the problem of using causal models to improve the rate at which good interventions can be learned online in a stochastic environment. Our formalism combines multi-armed bandits and causal inference to model a novel type of bandit feedback that is not exploited by existing approaches. We propose a new algorithm that exploits the causal feedback and prove a bound on its simple regret that is strictly better (in all quantities) than that of algorithms that do not use the additional causal information.
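
For readers unfamiliar with the term, simple regret is conventionally defined as sketched below. The notation ($\mu_a$, $\mu^*$, $\hat{a}_T$, $T$) is assumed here for illustration and is not taken from the abstract.

```latex
% Standard definition of simple regret after T rounds (notation assumed):
%   \mu_a      expected reward of intervention (arm) a
%   \mu^*      = \max_a \mu_a, the best achievable expected reward
%   \hat{a}_T  the intervention the learner recommends after T rounds
R_T = \mu^* - \mathbb{E}\left[ \mu_{\hat{a}_T} \right]
```

Unlike cumulative regret, this quantity measures only the quality of the final recommended intervention, so the learner is not penalized for exploring during the $T$ rounds.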
