Improving Reinforcement Learning Rates Using Prior Knowledge

Abstract

This paper presents a new technique, Prior Policy Fallback (PPF), that provides a simple means of incorporating prior knowledge into a reinforcement learning system based on Q-learning. PPF performs significantly better than an intuitively similar method that selectively initializes values in the Q table. The benefit arises because PPF does not interact adversely with the normal Q-learning update mechanism: it shortens the time to reward during initial trials and then gradually becomes less involved as normal Q-learning updates accumulate. PPF offers an alternative to manual teaching methods for accelerating learning during early trials, and it can be used in conjunction with many existing methods for accelerating Q-learning updates. The benefits are illustrated through several experiments in a static grid-based world.
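
The abstract does not spell out the fallback rule, so the following is only a minimal sketch of one plausible reading, not the paper's implementation. It assumes the agent consults the prior policy whenever every Q value in the current state is still at its initial value of zero, and otherwise acts epsilon-greedily on the learned values. The 5x5 grid, the reward of 1.0 at the goal, the hand-written `prior_policy`, and all parameter values are hypothetical choices made for illustration.

```python
import random

# Hypothetical static grid world: start at (0, 0), reward 1.0 on reaching GOAL.
SIZE = 5
GOAL = (4, 4)
ACTIONS = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # (dx, dy) moves
MAX_STEPS = 1000  # safety cap per episode


def step(state, action_index):
    """Apply a move, clamping at the walls; return (next_state, reward, done)."""
    dx, dy = ACTIONS[action_index]
    x = min(max(state[0] + dx, 0), SIZE - 1)
    y = min(max(state[1] + dy, 0), SIZE - 1)
    nxt = (x, y)
    done = nxt == GOAL
    return nxt, (1.0 if done else 0.0), done


def prior_policy(state):
    """Assumed prior knowledge: head toward the goal, x first, then y."""
    return 0 if state[0] < GOAL[0] else 1


def choose_action(q, state, use_prior, epsilon, rng):
    values = q[state]
    # Assumed fallback rule: nothing learned here yet, so defer to the prior.
    if use_prior and all(v == 0.0 for v in values):
        return prior_policy(state)
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    best = max(values)
    return rng.choice([i for i, v in enumerate(values) if v == best])


def run(use_prior, episodes, seed, alpha=0.5, gamma=0.9, epsilon=0.1):
    """Tabular Q-learning; returns the number of steps taken in each episode."""
    rng = random.Random(seed)
    q = {(x, y): [0.0] * len(ACTIONS) for x in range(SIZE) for y in range(SIZE)}
    steps_per_episode = []
    for _ in range(episodes):
        state, steps, done = (0, 0), 0, False
        while not done and steps < MAX_STEPS:
            a = choose_action(q, state, use_prior, epsilon, rng)
            nxt, reward, done = step(state, a)
            # Standard Q-learning update; the fallback never writes to the table.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state, steps = nxt, steps + 1
        steps_per_episode.append(steps)
    return steps_per_episode
```

Calling `run(True, 20, 0)` and `run(False, 20, 0)` gives per-episode step counts for the two variants. Under these assumptions the first prior-guided episode takes exactly 8 steps, because the untrained agent follows the prior straight to the goal. The fallback then fires less often as reward propagates backward and states acquire nonzero values, which is one way to read "gradually becomes less involved". Because the prior only influences action selection and never touches the Q table, it cannot distort the update targets.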
