Conference: Algorithmic Learning Theory (Lecture Notes in Artificial Intelligence, vol. 4264)

Asymptotic Learnability of Reinforcement Problems with Arbitrary Dependence



Abstract

We address the problem of reinforcement learning in which observations may exhibit an arbitrary form of stochastic dependence on past observations and actions, i.e., environments more general than (PO)MDPs. The task for an agent is to attain the best possible asymptotic reward when the true generating environment is unknown but belongs to a known countable family of environments. We find sufficient conditions on the class of environments under which there exists an agent that attains the best asymptotic reward for every environment in the class. We analyze how tight these conditions are and how they relate to various probabilistic assumptions known in reinforcement learning and related fields, such as Markov Decision Processes and mixing conditions.
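
To make the setting concrete, here is a minimal, hypothetical Python sketch. It is not the paper's construction: it restricts the known family to two i.i.d. Bernoulli bandit models (a degenerate special case of the arbitrarily dependent environments treated here) and runs a naive hypothesis-elimination agent. The class names, the tolerance, and the exploration schedule are all assumptions of this sketch.

# Hypothetical sketch, not the paper's algorithm: the agent knows a
# finite family of candidate models containing the true environment and
# tries to attain the best asymptotic average reward.

import random

class CandidateBandit:
    """Candidate model: probs[a] = P(reward = 1 | action a)."""
    def __init__(self, probs):
        self.probs = probs

    def best_action(self):
        return max(range(len(self.probs)), key=lambda a: self.probs[a])

def run_agent(candidates, true_env, horizon=20000, tol=0.1, seed=0):
    rng = random.Random(seed)
    n = len(true_env.probs)
    counts, sums = [0] * n, [0.0] * n
    alive = list(range(len(candidates)))   # not-yet-refuted model indices
    total = 0.0
    for t in range(horizon):
        # Decaying exploration keeps testing every candidate's predictions.
        if rng.random() < (t + 1) ** -0.5:
            a = rng.randrange(n)
        else:
            # Exploit: optimal action of the lowest-index surviving model.
            a = candidates[alive[0]].best_action()
        r = 1.0 if rng.random() < true_env.probs[a] else 0.0
        counts[a] += 1
        sums[a] += r
        total += r
        # Refute candidates whose predicted mean for action a is far from
        # the empirical mean (crude test; noise can misfire, hence reset).
        if counts[a] >= 100:
            emp = sums[a] / counts[a]
            alive = [i for i in alive
                     if abs(candidates[i].probs[a] - emp) < tol]
            if not alive:                  # everything refuted by noise
                alive = list(range(len(candidates)))
    return total / horizon, alive

family = [CandidateBandit([0.2, 0.8]), CandidateBandit([0.7, 0.3])]
avg, alive = run_agent(family, true_env=family[0])
print(avg, alive)   # average reward approaches 0.8; alive shrinks to [0]

With arbitrary dependence, a per-action mean test like this is no longer sound; difficulties of that kind are what the sufficient conditions on the class of environments are meant to address.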


