
Reinforcement learning in environments with independent delayed-sense dynamics.


Abstract

This thesis is a detailed investigation into applying reinforcement learning to environments with independent delayed-sense dynamics (IDSD), where some of the state variables evolve independently of both the agent's actions and the other state variables, and can be sensed only after a delay. These independent state variables are analogous to disturbances, since they are independent of control actions and are not observable before the agent commits to a course of action.

In this thesis, we first formalize IDSD problems and then develop four reinforcement learning algorithms that exploit the structure of IDSD problems to achieve better efficiency. Two of the algorithms are partially model-based and two are model-free. We argue that, for the same number of experiments, the policy learned by the proposed algorithms is of better quality than the policy learned by conventional reinforcement learning algorithms.

We demonstrate the effectiveness of our algorithms by applying them to traffic grid-world problems and to a hybrid vehicle problem, in which traffic and driver acceleration, respectively, play the role of the independent state variable. We show experimentally that our algorithms evaluate a given policy more accurately than the corresponding TD(0). We also show that, in the control case, our algorithms learn substantially faster than conventional reinforcement learning algorithms that do not use knowledge of the IDSD structure.
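To make the setting concrete, the following is a minimal sketch of the conventional baseline the abstract compares against: tabular TD(0) policy evaluation on a toy problem with one independent, delayed-sense variable. It is not one of the four algorithms from the thesis, which the abstract does not describe. The corridor layout, the traffic transition probabilities in `P_JAM`, the rewards, and all function names are assumptions chosen only for illustration.

```python
import random

# Hypothetical toy problem (not taken from the thesis): a corridor of N cells.
# The agent follows a fixed policy: always move one cell to the right.
N = 5
GOAL = N - 1

# Assumed traffic chain: probability that the next traffic level is 1 (jam),
# given the last sensed traffic level. It depends on nothing else.
P_JAM = {0: 0.2, 1: 0.7}


def step_traffic(traffic, rng):
    """Independent dynamics: ignores the agent's position and action."""
    return 1 if rng.random() < P_JAM[traffic] else 0


def step(pos, sensed_traffic, rng):
    """Commit to the move first; the new traffic level is sensed only afterwards."""
    new_traffic = step_traffic(sensed_traffic, rng)
    reward = -3.0 if new_traffic == 1 else -1.0  # a jam makes the move costlier
    return pos + 1, new_traffic, reward


def td0_evaluate(episodes=5000, alpha=0.05, gamma=1.0, seed=0):
    """Conventional tabular TD(0) over the joint state (position, last sensed traffic)."""
    rng = random.Random(seed)
    V = {(p, w): 0.0 for p in range(N) for w in (0, 1)}
    for _ in range(episodes):
        pos, traffic = 0, rng.choice((0, 1))
        while pos != GOAL:
            nxt_pos, nxt_traffic, r = step(pos, traffic, rng)
            # TD(0) update: move V(s) toward r + gamma * V(s').
            target = r + gamma * V[(nxt_pos, nxt_traffic)]
            V[(pos, traffic)] += alpha * (target - V[(pos, traffic)])
            pos, traffic = nxt_pos, nxt_traffic
    return V


if __name__ == "__main__":
    values = td0_evaluate()
    for state in sorted(values):
        print(state, round(values[state], 2))
```

This baseline treats the traffic variable like any other state component and learns a separate value for every (position, traffic) pair. The abstract's claim is that algorithms aware of the IDSD structure can do better than this with the same amount of data.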


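Bibliographic record

Author: Shahamiri, Masoud
Author affiliation: University of Alberta (Canada)
Degree-granting institution: University of Alberta (Canada)
Subject: Computer Science
Degree: M.Sc.
Year: 2008
Pages: 54 p.
Total pages: 54
Original format: PDF
Language of text: English
CLC classification: Automation technology, computer technology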

