The interaction of representations and planning objectives for decision-theoretic planning tasks

SVEN KOENIG; YAXIN LIU

Abstract

This article studies decision-theoretic planning or reinforcement learning in the presence of traps such as steep slopes for outdoor robots or staircases for indoor robots. In this case, achieving the goal from the start is often the primary objective, while minimizing the travel time is only of secondary importance. This article studies how this planning objective interacts with possible representations of the planning tasks, namely whether to use a discount factor equal to one or smaller than one, and whether to use the action-penalty or the goal-reward representation. It is shown that the action-penalty representation without discounting guarantees that the plan that maximizes the expected reward also achieves the goal from the start (provided that this is possible), but neither the action-penalty representation with discounting nor the goal-reward representation with discounting has this property. The article then shows exactly when this trapping phenomenon occurs, using a novel interpretation of discounting, namely that it models agents that use convex exponential utility functions and thus are optimistic in the face of uncertainty. Finally, it is shown how the selective state-deletion method can be used in conjunction with standard decision-theoretic planners to eliminate the trapping phenomenon.
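
The trapping phenomenon described in the abstract can be illustrated with a toy task. The sketch below is not taken from the article: the two-plan task, the function names, and the numbers (a 20-step safe path, a 50% chance of falling into the trap, a discount factor of 0.9) are all assumptions chosen for illustration. The "safe" plan reaches the goal with certainty after `n_safe` actions; the "risky" plan takes one action that reaches the goal with probability `p_goal` and otherwise ends in a trap from which the goal can never be reached.

```python
import math


def action_penalty_values(gamma, n_safe, p_goal):
    """Expected total discounted reward of each plan when every action
    costs -1 and the goal is an absorbing state with value 0.
    Assumes 0 <= p_goal < 1 (hypothetical toy task, not from the article)."""
    # A trapped agent keeps paying -1 per action forever.
    trap = -math.inf if gamma == 1.0 else -1.0 / (1.0 - gamma)
    # Safe plan: n_safe actions, each costing -1, discounted over time.
    safe = -sum(gamma ** t for t in range(n_safe))
    # Risky plan: one action, then the goal (value 0) or the trap.
    risky = -1.0 + gamma * (1.0 - p_goal) * trap
    return {"safe": safe, "risky": risky}


def goal_reward_values(gamma, n_safe, p_goal):
    """Expected total discounted reward of each plan when only the action
    that enters the goal is rewarded (+1) and all other actions yield 0."""
    safe = gamma ** (n_safe - 1)   # reward arrives with the n_safe-th action
    risky = p_goal                 # reward arrives with the first action
    return {"safe": safe, "risky": risky}


def best_plan(values):
    """Return the name of the plan with the largest expected reward."""
    return max(values, key=values.get)


# Illustrative settings (assumed, not from the article).
N_SAFE, P_GOAL = 20, 0.5

print("action-penalty, gamma=1.0:",
      best_plan(action_penalty_values(1.0, N_SAFE, P_GOAL)))
print("action-penalty, gamma=0.9:",
      best_plan(action_penalty_values(0.9, N_SAFE, P_GOAL)))
print("goal-reward,    gamma=0.9:",
      best_plan(goal_reward_values(0.9, N_SAFE, P_GOAL)))
```

With these assumed numbers, the undiscounted action-penalty representation prefers the safe plan, because the trap has an unbounded negative value. Both discounted representations prefer the risky plan, because discounting bounds the cost of being trapped and shrinks the value of a distant goal.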

Bibliographic details

Source
Journal of Experimental and Theoretical Artificial Intelligence | 2002, No. 4 | pp. 303-326 | 24 pages
Authors
SVEN KOENIG; YAXIN LIU
Author affiliation
College of Computing, Georgia Institute of Technology, Atlanta, Georgia 30332-0280, USA
Original format: PDF
Language of text: English
Chinese Library Classification: Computing technology, computer technology
Keywords
decision-theoretic planning; reinforcement learning; planning objectives; planning-task representations; discounting; action-penalty representation; goal-reward representation; trapping phenomenon

Similar literature

1. Scripts and information units in future planning: Interactions between a past and a future planning task [J]. Cordonnier Aline, Barnier Amanda J., Sutton John. The Quarterly Journal of Experimental Psychology: QJEP. 2016, No. 2
2. Evolutionary robust optimization in production planning - interactions between number of objectives, sample size and choice of robustness measure [J]. Diaz Juan Esteban, Handl Julia, Xu Dong-Ling. Computers & Operations Research. 2017, No. MARa
3. Evaluating ecological representation within differing planning objectives for the central coast of British Columbia [J]. Wells RW, Bunnell FL, Haag D. Canadian Journal of Forest Research. 2003, No. 11
4. Representations of Decision-Theoretic Planning Tasks [C]. Sven Koenig, Yaxin Liu. International Conference on Artificial Intelligence Planning and Scheduling; April 14-17, 2000; Breckenridge, CO (US). 2000
5. Decision-theoretic planning under risk-sensitive planning objectives [D]. Liu, Yaxin. 2005
6. Underwater Robot Task Planning Using Multi-Objective Meta-Heuristics [O]. Itziar Landa-Torres, Diana Manjarres, Sonia Bilbao. 2017
7. The interaction of representations and planning objectives for decision-theoretic planning tasks [O]. Sven Koenig, Yaxin Liu. 2002
8. Planning and Cost Sharing Policy Options for Water and Related Land Programs. Part 3B. Options for Planning Objectives: An Integrated Objective Approach to Multiple Objective Planning [R]. 1975
