Reinforcement Learning and Apprenticeship Learning for Robotic Control


Abstract

Many robotic control problems, such as autonomous helicopter flight, legged robot locomotion, and autonomous driving, remain challenging even for modern reinforcement learning algorithms. These problems are challenging for several reasons: (i) it can be hard to write down, in closed form, a formal specification of the control task (for example, what is the cost function for "driving well"?); (ii) it is often difficult to learn a good model of the robot's dynamics; and (iii) even given a complete specification of the problem, it is often computationally difficult to find a good closed-loop controller for a high-dimensional, stochastic control task. However, when we are allowed to learn from a human demonstration of a task (in other words, when we are in the apprenticeship learning setting), a number of efficient algorithms can be used to address each of these problems.

To motivate the first of the problems described above, consider teaching a young adult to drive: rather than telling the student what the cost function is for driving, it is much easier and more natural to demonstrate driving to them and have them learn from the demonstration. In practical applications, it is also (perhaps surprisingly) common practice to manually tweak cost functions until the correct behavior is obtained. Thus, we would like to devise algorithms that can learn from a teacher's demonstration without needing to be explicitly told the cost function. For example, can we "guess" the teacher's cost function based on the demonstration, and use that in our own learning task?

Ng and Russell [8] developed a set of inverse reinforcement learning algorithms for guessing the teacher's cost function. More recently, Abbeel and Ng [1] showed that even though the teacher's "true" cost function is ambiguous and thus can never be recovered, it is nevertheless possible to recover a cost function that allows us to learn a policy whose performance is comparable to the teacher's, where performance is evaluated on the teacher's unknown (and unknowable) cost function. Thus, access to a demonstration removes the need to explicitly write down a cost function.
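
The closing claim of the abstract can be illustrated with a small numerical sketch. This is not the algorithm of [1] or [8]. It assumes, purely for illustration, that the unknown cost is linear in a hand-chosen feature vector, cost(s) = w · phi(s), an assumption the abstract itself does not state. Under that assumption, a learner whose discounted feature expectations are close to the teacher's incurs nearly the same expected cost for every weight vector w of bounded norm, so the teacher's particular w never has to be identified. The trajectories, the two features, and all function names below are hypothetical.

```python
import math


def feature_expectations(trajectories, gamma):
    """Average discounted sum of feature vectors over a set of trajectories.

    Each trajectory is a list of feature vectors phi(s_t), one per time step.
    """
    dim = len(trajectories[0][0])
    mu = [0.0] * dim
    for trajectory in trajectories:
        for t, phi in enumerate(trajectory):
            for i in range(dim):
                mu[i] += (gamma ** t) * phi[i]
    return [m / len(trajectories) for m in mu]


def expected_cost(w, mu):
    """Expected discounted cost under the assumed linear cost w . phi(s)."""
    return sum(wi * mi for wi, mi in zip(w, mu))


def cost_gap_bound(w, mu_a, mu_b):
    """Cauchy-Schwarz bound: |w.mu_a - w.mu_b| <= ||w||_2 * ||mu_a - mu_b||_2."""
    norm_w = math.sqrt(sum(x * x for x in w))
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(mu_a, mu_b)))
    return norm_w * dist


# Hypothetical per-step features: [speed deviation, lane offset].
teacher_trajs = [
    [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0]],
    [[0.2, 0.0], [0.0, 0.1], [0.1, 0.0]],
]
learner_trajs = [
    [[0.1, 0.1], [0.0, 0.0], [0.1, 0.0]],
    [[0.1, 0.0], [0.1, 0.1], [0.0, 0.0]],
]
gamma = 0.9

mu_teacher = feature_expectations(teacher_trajs, gamma)
mu_learner = feature_expectations(learner_trajs, gamma)

# Several candidate "true" cost weights; the learner never needs to know
# which one (if any) the teacher actually uses.
candidate_costs = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8], [-0.6, 0.8]]
for w in candidate_costs:
    gap = abs(expected_cost(w, mu_teacher) - expected_cost(w, mu_learner))
    bound = cost_gap_bound(w, mu_teacher, mu_learner)
    print(f"w={w}: cost gap={gap:.4f}, bound={bound:.4f}")
```

For every candidate weight vector the printed gap stays below the bound, which depends only on how closely the learner matches the teacher's feature expectations and not on which cost function is the "true" one.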


Bibliographic record

Source: International Conference on Algorithmic Learning Theory, 2006, 3 pages
Author: Andrew Y. Ng
Original format: PDF
CLC classification: Artificial intelligence theory

Similar literature

1. Self-learning Processes in Smart Factories: Deep Reinforcement Learning for Process Control of Robot Brine Injection [J]. Rasmus E. Andersen, Steffen Madsen, Alexander B.K. Barlo. Procedia Manufacturing. 2019, Issue 4

2. Transfer learning with Partially Constrained Models: Application to reinforcement learning of linked multicomponent robot system control [J]. Borja Fernandez-Gauna, Jose Manuel Lopez-Guede, Manuel Grana. Robotics and Autonomous Systems. 2013, Issue 7

3. Comparison of end-to-end and hybrid deep reinforcement learning strategies for controlling cable-driven parallel robots [J]. Xiong Hao, Ma Tianqi, Zhang Lin. Neurocomputing. 2020, Feb. 15 issue

4. Reinforcement Learning and Apprenticeship Learning for Robotic Control [C]. Andrew Y. Ng. Algorithmic Learning Theory; Lecture Notes in Artificial Intelligence; 4264. 2006

5. Apprenticeship learning and reinforcement learning with application to robotic control [D]. Abbeel, Pieter. 2008

6. Learning for a Robot: Deep Reinforcement Learning, Imitation Learning, Transfer Learning [O]. Jiang Hua, Liangcai Zeng, Gongfa Li. 2021

7. Learning to Drive via Apprenticeship Learning and Deep Reinforcement Learning [O]. Wenhui Huang, Francesco Braghin, Zhuo Wang. 2019

8. Apprenticeship Learning for Robotic Control [R]. Abbeel, P. 2015
