Published in: IEEE-RAS International Conference on Humanoid Robots

Learning Sequential Decision Tasks for Robot Manipulation with Abstract Markov Decision Processes and Demonstration-Guided Exploration

Abstract

Solving high-level sequential decision tasks situated on physical robots is a challenging problem. Reinforcement learning, the standard paradigm for solving sequential decision problems, allows robots to learn directly from experience, but is ill-equipped to deal with issues of scalability and uncertainty introduced by real-world tasks. We reformulate the problem representation to better apply to robot manipulation using the relations of Object-Oriented MDPs (OO-MDPs) and the hierarchical structure provided by Abstract MDPs (AMDPs). We present a relation-based AMDP formulation for solving tabletop organizational packing tasks, as well as a demonstration-guided exploration algorithm for learning AMDP transition functions inspired by state- and action-centric learning from demonstration approaches. We evaluate our representation and learning methods in a simulated environment, showing that our hierarchical representation is suitable for solving complex tasks, and that our state- and action-centric exploration biasing methods are both effective and complementary for efficiently learning AMDP transition functions. We show that the learned policy can be transferred to different tabletop organizational packing tasks, and validate that the policy can be realized on a physical system.
