Journal of Experimental and Theoretical Artificial Intelligence

The interaction of representations and planning objectives for decision-theoretic planning tasks

Abstract

This article studies decision-theoretic planning or reinforcement learning in the presence of traps such as steep slopes for outdoor robots or staircases for indoor robots. In this case, achieving the goal from the start is often the primary objective, while minimizing the travel time is only of secondary importance. This article studies how this planning objective interacts with possible representations of the planning tasks, namely whether to use a discount factor equal to one or smaller than one and whether to use the action-penalty or the goal-reward representation. It is shown that the action-penalty representation without discounting guarantees that the plan that maximizes the expected reward also achieves the goal from the start (provided that this is possible), but neither the action-penalty representation with discounting nor the goal-reward representation with discounting has this property. The article then shows exactly when this trapping phenomenon occurs, using a novel interpretation of discounting, namely that it models agents that use convex exponential utility functions and thus are optimistic in the face of uncertainty. Finally, it is shown how the selective state-deletion method can be used in conjunction with standard decision-theoretic planners to eliminate the trapping phenomenon.
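The trapping phenomenon described in the abstract can be illustrated with a small numerical sketch. The toy task below is an assumption made for illustration and is not taken from the article: from the start state, a hypothetical "risky" action reaches the goal in one step with probability p and otherwise falls into a trap from which the goal is unreachable, while a hypothetical "safe" plan reaches the goal with certainty after n steps. The function names, the parameter values, and the convention that the trap costs -1 per step forever under the action-penalty representation are likewise assumptions.

```python
import math


def plan_values(p, n, gamma, representation):
    """Return (risky_value, safe_value) at the start state of the toy task.

    Assumed toy task (not from the article):
      risky: one action; reaches the goal with probability p,
             otherwise enters a trap that can never be left.
      safe:  n deterministic actions, then the goal.
    """
    if representation == "action-penalty":
        # Reward -1 per action, goal state has value 0.
        if gamma == 1.0:
            trap = -math.inf          # the agent pays -1 forever
            safe = -float(n)
        else:
            trap = -1.0 / (1.0 - gamma)
            safe = -(1.0 - gamma ** n) / (1.0 - gamma)
        # Guard against 0 * inf when the risky action cannot fail.
        trap_term = gamma * (1.0 - p) * trap if p < 1.0 else 0.0
        risky = -1.0 + trap_term
    elif representation == "goal-reward":
        # Reward +1 for the action that enters the goal, 0 otherwise.
        risky = p
        safe = gamma ** (n - 1)
    else:
        raise ValueError(f"unknown representation: {representation}")
    return risky, safe


def best_plan(p, n, gamma, representation):
    """Name of the plan with the larger expected (discounted) reward."""
    risky, safe = plan_values(p, n, gamma, representation)
    return "risky" if risky > safe else "safe"


if __name__ == "__main__":
    for rep, gamma in [("action-penalty", 1.0),
                       ("action-penalty", 0.9),
                       ("goal-reward", 0.9)]:
        print(rep, gamma, best_plan(p=0.5, n=20, gamma=gamma, representation=rep))
```

With p = 0.5 and n = 20, the undiscounted action-penalty representation prefers the safe plan, because the risky plan has expected reward of minus infinity. With a discount factor of 0.9, both representations prefer the risky plan, which reaches the goal only half the time. In this toy task the discounted cases switch to the risky plan exactly when p exceeds gamma raised to the power n - 1, which is one simple instance of the optimism under uncertainty that the abstract attributes to discounting.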
