Venue: European Conference on Artificial Intelligence

Analysis of Inverse Reinforcement Learning with Perturbed Demonstrations

Abstract

Inverse reinforcement learning (IRL) addresses the problem of recovering the unknown reward function for a given Markov decision problem (MDP) given the corresponding optimal policy or a perturbed version thereof. This paper studies the space of possible solutions to the general IRL problem, when the agent is provided with incomplete/imperfect information regarding the optimal policy for the MDP whose reward must be estimated. We focus on scenarios with finite state-action spaces and discuss the constraints imposed on the set of possible solutions when the agent is provided with (i) perturbed policies; (ii) optimal policies; and (iii) incomplete policies. We discuss previous works on IRL in light of our analysis and show that, with our characterization of the solution space, it is possible to determine non-trivial closed-form solutions for the IRL problem. We also discuss several other interesting aspects of the IRL problem that stem from our analysis.
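
As a rough illustration of the kind of constraint an optimal policy places on a candidate reward in a finite MDP, the sketch below checks the standard optimality condition: under the candidate reward, no single-action deviation from the given policy may have a higher value than following the policy. This is not the paper's own characterization. The state-only reward vector, the deterministic policy, the discount factor `gamma`, the function names, and the two-state MDP are all assumptions made for the example.

```python
def evaluate_policy(P, r, policy, gamma, sweeps=10_000, eps=1e-12):
    """Iteratively compute V^pi for a deterministic policy.

    P[a][s][t] is the probability of moving from state s to state t
    under action a; r[s] is a (hypothetical) state-only reward.
    """
    n = len(r)
    V = [0.0] * n
    for _ in range(sweeps):
        new_V = [
            r[s] + gamma * sum(P[policy[s]][s][t] * V[t] for t in range(n))
            for s in range(n)
        ]
        delta = max(abs(a - b) for a, b in zip(new_V, V))
        V = new_V
        if delta < eps:
            break
    return V


def reward_is_consistent(P, r, policy, gamma, tol=1e-8):
    """Return True if `policy` is optimal for reward `r`.

    The policy is optimal exactly when Q(s, a) <= V(s) for every
    state s and action a, where Q and V are evaluated under the policy.
    """
    n = len(r)
    V = evaluate_policy(P, r, policy, gamma)
    for s in range(n):
        for a in range(len(P)):
            q = r[s] + gamma * sum(P[a][s][t] * V[t] for t in range(n))
            if q > V[s] + tol:
                return False
    return True


# Hypothetical two-state MDP: action 0 stays put, action 1 switches state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0: stay
    [[0.0, 1.0], [1.0, 0.0]],  # action 1: switch
]
policy = [1, 0]  # switch out of state 0, stay in state 1
gamma = 0.9

print(reward_is_consistent(P, [0.0, 1.0], policy, gamma))
print(reward_is_consistent(P, [1.0, 0.0], policy, gamma))
print(reward_is_consistent(P, [0.0, 0.0], policy, gamma))
```

In this toy setup the reward that favours state 1 is consistent with the policy, while the reward that favours state 0 is not. The all-zero reward also passes the check, which is the usual degeneracy of IRL and the reason a characterization of the whole solution set is needed to single out non-trivial rewards.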
