
Learning Sequential Decision Tasks for Robot Manipulation with Abstract Markov Decision Processes and Demonstration-Guided Exploration



Abstract

Solving high-level sequential decision tasks situated on physical robots is a challenging problem. Reinforcement learning, the standard paradigm for solving sequential decision problems, allows robots to learn directly from experience, but is ill-equipped to deal with issues of scalability and uncertainty introduced by real-world tasks. We reformulate the problem representation to better apply to robot manipulation using the relations of Object-Oriented MDPs (OO-MDPs) and the hierarchical structure provided by Abstract MDPs (AMDPs). We present a relation-based AMDP formulation for solving tabletop organizational packing tasks, as well as a demonstration-guided exploration algorithm for learning AMDP transition functions inspired by state- and action-centric learning from demonstration approaches. We evaluate our representation and learning methods in a simulated environment, showing that our hierarchical representation is suitable for solving complex tasks, and that our state- and action-centric exploration biasing methods are both effective and complementary for efficiently learning AMDP transition functions. We show that the learned policy can be transferred to different tabletop organizational packing tasks, and validate that the policy can be realized on a physical system.


Bibliographic details

Source: IEEE-RAS International Conference on Humanoid Robots | 2018 | pp. 1-8 | 8 pages
Authors: David Kent; Siddhartha Banerjee; Sonia Chernova
Original format: PDF
Keywords: Task analysis; Robots; Containers; Planning; Object oriented modeling; Scalability; Grippers

