SICE Annual Conference
A study on use of prior information for acceleration of reinforcement learning

Abstract

Reinforcement learning is a method by which an agent learns appropriate responses for solving a problem through trial and error. Its advantage is that it can be applied to unknown or uncertain problems. Its drawback is that the trial-and-error process takes a long time to solve the problem. If prior information about the environment is available, some of the trial and error can be skipped and learning takes less time. However, prior information provided by a human designer can be wrong because of uncertainties in the problem. Using wrong prior information can have harmful effects, such as failing to obtain the optimal policy or slowing down learning. We propose controlling the use of prior information to suppress these effects: the agent gradually forgets the prior information by applying a multiplicative forgetting factor while it learns a better policy. We apply the proposed method to a couple of testbed environments with several types of prior information. The method shows good results in terms of both learning speed and the quality of the obtained policies.
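
The abstract does not say how the prior information is represented or where the forgetting factor is applied, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes tabular Q-learning, a prior given as a per-state table of action preferences, and a prior weight that is added to the learned values during action selection and multiplied by the forgetting factor once per episode. The toy chain environment and every name and default value here (decayed_weights, select_action, train_chain, forgetting_factor=0.95, and so on) are hypothetical.

```python
import random


def decayed_weights(initial_weight, forgetting_factor, steps):
    """Return the prior weight in effect at each of `steps` episodes."""
    weights = []
    weight = initial_weight
    for _ in range(steps):
        weights.append(weight)
        weight *= forgetting_factor  # multiplicative forgetting
    return weights


def select_action(q_row, prior_row, weight, epsilon, rng):
    """Epsilon-greedy choice over learned values plus the weighted prior."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    scores = [q + weight * p for q, p in zip(q_row, prior_row)]
    return max(range(len(scores)), key=scores.__getitem__)


def train_chain(prior, n_states=5, episodes=300, alpha=0.5, gamma=0.9,
                epsilon=0.2, forgetting_factor=0.95, seed=0):
    """Q-learning on a toy chain (assumed environment, not from the paper).

    Action 1 moves right, action 0 moves left; reaching the rightmost
    state gives reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    weight = 1.0  # assumed initial trust in the prior
    for _ in range(episodes):
        state = 0
        for _ in range(100):  # step cap so every episode terminates
            action = select_action(q[state], prior[state], weight, epsilon, rng)
            if action == 1:
                next_state = min(state + 1, n_states - 1)
            else:
                next_state = max(state - 1, 0)
            done = next_state == n_states - 1
            reward = 1.0 if done else 0.0
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
            if done:
                break
        weight *= forgetting_factor  # forget the prior a little each episode
    return q, weight
```

In this sketch the prior only biases action selection and never overwrites the learned values, so a wrong prior loses influence as its weight shrinks. For example, with learned values [0.0, 0.5] and a prior [1.0, 0.0] that wrongly favors action 0, select_action picks action 0 at weight 1.0 but action 1 once the weight has decayed to 0.25.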