SICE Annual Conference 2011: Final Program and Abstracts

A study on use of prior information for acceleration of reinforcement learning

Abstract

Reinforcement learning is a method by which an agent learns appropriate responses for solving problems through trial and error. Its advantage is that it can be applied to unknown or uncertain problems. The drawback, however, is that trial and error makes learning slow. If prior information about the environment is available, some of the trial and error can be spared and learning can finish in a shorter time. However, prior information provided by a human designer can be wrong because of uncertainties in the problem, and using wrong prior information can have harmful effects such as failure to obtain the optimal policy and slower learning. We propose to control the use of prior information so as to suppress these harmful effects: the agent gradually forgets the prior information by multiplying it by a forgetting factor as it learns a better policy. We apply the proposed method to a couple of testbed environments and several types of prior information. The method shows good results in terms of both learning speed and the quality of the obtained policies.
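The abstract does not spell out the update rule, but the core idea of biasing action selection with designer-supplied prior values and shrinking that bias by a forgetting factor can be sketched as follows. This is a minimal illustration only, assuming a tabular Q-learning agent on a hypothetical 5x5 grid world; the prior values, the forgetting factor of 0.98, and all other parameters are illustrative assumptions, not the authors' actual setup.

import random

# Illustrative sketch: tabular Q-learning where a designer-supplied prior biases
# greedy action selection, and the bias is multiplied by a forgetting factor
# after every episode so the prior's influence fades as learning proceeds.

SIZE = 5
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
GOAL = (SIZE - 1, SIZE - 1)


def step(state, action):
    """Move within the grid; reward 1.0 only on reaching the goal."""
    dr, dc = ACTIONS[action]
    r = min(max(state[0] + dr, 0), SIZE - 1)
    c = min(max(state[1] + dc, 0), SIZE - 1)
    nxt = (r, c)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


def make_prior():
    """Hypothetical designer hint: moving down or right looks promising."""
    return {((r, c), a): (0.5 if a in (1, 3) else 0.0)
            for r in range(SIZE) for c in range(SIZE) for a in range(4)}


def q_learning_with_decaying_prior(episodes=300, alpha=0.1, gamma=0.95,
                                   epsilon=0.1, forgetting=0.98):
    q = {((r, c), a): 0.0 for r in range(SIZE) for c in range(SIZE) for a in range(4)}
    prior = make_prior()
    w = 1.0  # weight of the prior's influence on action selection
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(200):  # cap episode length so the sketch always terminates
            if random.random() < epsilon:
                action = random.randrange(4)
            else:
                # Greedy choice uses the learned Q-values plus the decaying prior bias.
                action = max(range(4), key=lambda a: q[(state, a)] + w * prior[(state, a)])
            nxt, reward, done = step(state, action)
            best_next = max(q[(nxt, a)] for a in range(4))
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = nxt
            if done:
                break
        w *= forgetting  # gradually forget the prior information
    return q


if __name__ == "__main__":
    q = q_learning_with_decaying_prior()
    print("Greedy first move from (0, 0):",
          max(range(4), key=lambda a: q[((0, 0), a)]))

In this sketch, a forgetting factor close to 1 keeps the prior's influence for longer, which helps when the hint is mostly correct, while a smaller value discards a misleading hint sooner; the paper's contribution is about controlling this trade-off so that wrong prior information does not prevent the agent from reaching a good policy.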
