Conference: Uncertainty in Artificial Intelligence

PAC-Bayesian Policy Evaluation for Reinforcement Learning


Abstract

Bayesian priors offer a compact yet general means of incorporating domain knowledge into many learning tasks. The correctness of the Bayesian analysis and inference, however, largely depends on the accuracy and correctness of these priors. PAC-Bayesian methods overcome this problem by providing bounds that hold regardless of the correctness of the prior distribution. This paper introduces the first PAC-Bayesian bound for the batch reinforcement learning problem with function approximation. We show how this bound can be used to perform model selection in a transfer learning scenario. Our empirical results confirm that PAC-Bayesian policy evaluation is able to leverage prior distributions when they are informative and, unlike standard Bayesian RL approaches, ignore them when they are misleading.
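
For context, PAC-Bayesian guarantees of the kind referenced here follow a common template; the paper's RL-specific bound is not reproduced on this page, so the sketch below shows one standard supervised-learning form (the Maurer refinement of McAllester's bound). With probability at least $1-\delta$ over an i.i.d. sample of size $m$, simultaneously for every posterior $Q$ over hypotheses and for any fixed prior $P$,

\mathbb{E}_{h \sim Q}\!\left[R(h)\right] \;\le\; \mathbb{E}_{h \sim Q}\!\left[\hat{R}(h)\right] \;+\; \sqrt{\frac{\mathrm{KL}(Q \,\|\, P) + \ln\!\left(2\sqrt{m}/\delta\right)}{2m}},

where $R(h)$ is the true risk and $\hat{R}(h)$ the empirical risk. The bound holds no matter how poorly $P$ is chosen; a misleading prior only inflates the $\mathrm{KL}(Q \,\|\, P)$ penalty, which is exactly the property the abstract appeals to when contrasting PAC-Bayesian policy evaluation with standard Bayesian RL.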