
Reinforcement learning for robots using neural networks.


Abstract

Reinforcement learning agents are adaptive, reactive, and self-supervised. The aim of this dissertation is to extend the state of the art of reinforcement learning and enable its application to complex robot-learning problems. In particular, it focuses on two issues. First, learning from sparse and delayed reinforcement signals is hard and in general a slow process; techniques for reducing learning time must be devised. Second, most existing reinforcement learning methods assume that the world is a Markov decision process. This assumption is too strong for many robot tasks of interest.

This dissertation demonstrates how we can overcome the slow-learning problem and tackle non-Markovian environments, making reinforcement learning more practical for realistic robot tasks:

(1) Reinforcement learning can be naturally integrated with artificial neural networks to obtain high-quality generalization, resulting in a significant learning speedup. Neural networks are used throughout this dissertation, and they generalize effectively even in the presence of noise and a large number of binary and real-valued inputs.

(2) Reinforcement learning agents can save many learning trials by using an action model, which can be learned on-line. With a model, an agent can mentally experience the effects of its actions without actually executing them. Experience replay is a simple technique that implements this idea, and it is shown to be effective in reducing the number of action executions required.

(3) Reinforcement learning agents can take advantage of instructive training instances provided by human teachers, resulting in a significant learning speedup. Teaching can also help learning agents avoid local optima during the search for optimal control. Simulation experiments indicate that even a small amount of teaching can save agents many learning trials.

(4) Reinforcement learning agents can significantly reduce learning time through hierarchical learning: they first solve elementary learning problems and then combine the solutions to those elementary problems to solve a complex problem. Simulation experiments indicate that a robot with hierarchical learning can solve a complex problem that otherwise is hardly solvable within a reasonable time.

(5) Reinforcement learning agents can deal with a wide range of non-Markovian environments by keeping a memory of their past. Three memory architectures are discussed; they work reasonably well for a variety of simple problems, and one of them is also successfully applied to a nontrivial non-Markovian robot task.

The results of this dissertation rely on computer simulation, including (1) an agent operating in a dynamic and hostile environment and (2) a mobile robot operating in a noisy and non-Markovian environment. The robot simulator is physically realistic. This dissertation concludes that it is possible to build artificial agents that can acquire complex control policies effectively by reinforcement learning.
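The experience-replay idea in point (2) can be sketched in miniature. Everything below is an illustrative assumption rather than the dissertation's actual setup: a toy 5-state chain environment, a tabular Q-function (the dissertation uses neural-network Q-functions and robot tasks), and arbitrary hyperparameters. The key mechanism is the same, though: transitions gathered from real action executions are stored and repeatedly re-learned from, so fewer real executions are needed.

```python
import random
from collections import deque, defaultdict

random.seed(0)

N_STATES, GOAL = 5, 4          # toy chain: states 0..4, reaching state 4 pays +1
ACTIONS = (-1, +1)             # step left or right

def step(s, a):
    s2 = min(max(s + a, 0), N_STATES - 1)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

Q = defaultdict(float)          # tabular stand-in for the neural Q-function
memory = deque(maxlen=1000)     # replay memory of past transitions

alpha, gamma = 0.5, 0.9

for episode in range(200):
    s, done = 0, False
    while not done:
        a = random.choice(ACTIONS)           # explore with a random-walk policy
        s2, r, done = step(s, a)
        memory.append((s, a, r, s2, done))   # remember the real experience
        s = s2
    # Replay: re-learn from stored transitions instead of re-executing actions.
    for bs, ba, br, bs2, bdone in random.sample(list(memory), min(len(memory), 32)):
        target = br if bdone else br + gamma * max(Q[(bs2, x)] for x in ACTIONS)
        Q[(bs, ba)] += alpha * (target - Q[(bs, ba)])

# The greedy policy learned from replayed experience heads toward the goal.
policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N_STATES - 1)}
print(policy)
```

Each stored transition is reused many times, which is exactly the trade the abstract describes: cheap "mental" updates in place of expensive real action executions.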
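The simplest of the memory architectures alluded to in point (5) can be illustrated by keeping a sliding window of recent observations, so that a non-Markovian observation stream is turned into an (approximately) Markovian state. The window length and the observation strings below are illustrative assumptions, not details from the dissertation:

```python
from collections import deque

class WindowMemory:
    """Agent state = the K most recent observations (a simple history window)."""
    def __init__(self, k):
        self.window = deque([None] * k, maxlen=k)

    def observe(self, obs):
        self.window.append(obs)   # oldest observation falls out automatically

    def state(self):
        # The tuple of the last K observations serves as the agent's state,
        # letting a reactive policy condition on recent history.
        return tuple(self.window)

mem = WindowMemory(3)
for obs in ["wall", "corridor", "light", "corridor"]:
    mem.observe(obs)
print(mem.state())
```

With such a window, two situations that share the same current observation but differ in recent history map to different states, which is what lets the agent cope with non-Markovian tasks.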

Bibliographic details

  • Author

    Lin, Long-Ji

  • Author affiliation

    Carnegie Mellon University

  • Degree grantor: Carnegie Mellon University
  • Subjects: Artificial Intelligence; Computer Science
  • Degree: Ph.D.
  • Year: 1992
  • Pages: 160 p.
  • Total pages: 160
  • Format: PDF
  • Language: English
