SICE Annual Conference 2011: Final Program and Abstracts

A study on use of prior information for acceleration of reinforcement learning

Abstract

Reinforcement learning is a method by which an agent learns appropriate responses for solving problems through trial and error. Its advantage is that it can be applied to unknown or uncertain problems. The drawback, however, is that trial and error makes learning slow. If prior information about the environment is available, some of the trial and error can be spared and learning can finish in a shorter time. However, prior information provided by a human designer can be wrong because of uncertainties in the problem, and using wrong prior information can have harmful effects such as failure to obtain the optimal policy and slower learning. We propose to control the use of prior information so as to suppress these harmful effects: the agent gradually forgets the prior information by multiplying it by a forgetting factor as it learns a better policy. We apply the proposed method to a couple of testbed environments and several types of prior information. The method shows good results in terms of both learning speed and the quality of the obtained policies.
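The abstract does not spell out the update rule, but the core idea of biasing action selection with designer-supplied prior values and shrinking that bias by a forgetting factor can be sketched as follows. This is a minimal illustration only, assuming a tabular Q-learning agent on a hypothetical 5x5 grid world; the prior values, the forgetting factor of 0.98, and all other parameters are illustrative assumptions, not the authors' actual setup.

import random

# Illustrative sketch: tabular Q-learning where a designer-supplied prior biases
# greedy action selection, and the bias is multiplied by a forgetting factor
# after every episode so the prior's influence fades as learning proceeds.

SIZE = 5
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GOAL = (SIZE - 1, SIZE - 1)


def step(state, action):
    """Move within the grid; reward 1.0 only on reaching the goal."""
    dr, dc = ACTIONS[action]
    r = min(max(state[0] + dr, 0), SIZE - 1)
    c = min(max(state[1] + dc, 0), SIZE - 1)
    nxt = (r, c)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


def make_prior():
    """Hypothetical designer hint: moving down or right looks promising."""
    return {((r, c), a): (0.5 if a in (1, 3) else 0.0)
            for r in range(SIZE) for c in range(SIZE) for a in range(4)}


def q_learning_with_decaying_prior(episodes=300, alpha=0.1, gamma=0.95,
                                   epsilon=0.1, forgetting=0.98):
    q = {((r, c), a): 0.0 for r in range(SIZE) for c in range(SIZE) for a in range(4)}
    prior = make_prior()
    w = 1.0  # weight of the prior's influence on action selection
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(200):  # cap episode length so the sketch always terminates
            if random.random() < epsilon:
                action = random.randrange(4)
            else:
                # Greedy choice uses the learned Q-values plus the decaying prior bias.
                action = max(range(4), key=lambda a: q[(state, a)] + w * prior[(state, a)])
            nxt, reward, done = step(state, action)
            best_next = max(q[(nxt, a)] for a in range(4))
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break
        w *= forgetting  # gradually forget the prior information
    return q


if __name__ == "__main__":
    q = q_learning_with_decaying_prior()
    print("Greedy first move from (0, 0):",
          max(range(4), key=lambda a: q[((0, 0), a)]))

In this sketch, a forgetting factor close to 1 keeps the prior's influence for longer, which helps when the hint is mostly correct, while a smaller value discards a misleading hint sooner; the paper's contribution is about controlling this trade-off so that wrong prior information does not prevent the agent from reaching a good policy.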
