Conference: EUROCON 2009 (EUROCON '09)

Improving reinforcement learning using temporal-difference network EUROCON2009


Abstract

Reinforcement learning is a popular learning method for problems in many different domains. A key concern for this method is how quickly and efficiently it can learn a new problem. In this paper, we present a new approach that increases the efficiency of reinforcement learning with the help of a predictive model of the problem's environment, called a temporal-difference (TD) network, together with observations. This TD network is supplied with knowledge extracted, using a TD network, from another problem with the same task. First, a reinforcement-learning agent learns its environment for the task of wall following. Next, we train a temporal-difference network (TDN) with intervening observations in the brain of the agent to obtain a predictive model of the environment. The most promising action-observation sequences of the given environment are then extracted as knowledge to strengthen reinforcement learning in a new environment. Finally, this knowledge helps the reinforcement learning procedure produce more efficient results.
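The abstract gives no implementation details, so the following is only a rough sketch of the general idea of reusing extracted knowledge to speed up learning. It does not implement a TD network or the wall-following task. As a stand-in, it uses plain tabular Q-learning on a hypothetical six-state chain, treats the greedy observation-action pairs of a trained source agent as the "knowledge", and seeds a fresh Q-table with them. For brevity the target agent reuses the same chain rather than a new environment. Every name and parameter here (`step`, `q_learn`, `extract_knowledge`, `seed_q`, `bonus`, the chain itself) is an assumption made for illustration, not something taken from the paper.

```python
import random

N_STATES = 6          # hypothetical chain of observations 0..5; reaching 5 ends the episode
ACTIONS = (0, 1)      # 0 = move left, 1 = move right

def step(state, action):
    """Deterministic toy dynamics: reward 1 only on reaching the last state."""
    nxt = max(0, state - 1) if action == 0 else state + 1
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def greedy(q_row, rng):
    """Greedy action with random tie-breaking."""
    best = max(q_row)
    return rng.choice([a for a in ACTIONS if q_row[a] == best])

def q_learn(episodes, q=None, eps=0.1, alpha=0.5, gamma=0.9, seed=0, max_steps=1000):
    """Tabular Q-learning; returns the Q-table and the step count of each episode."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)] if q is None else [row[:] for row in q]
    steps_per_episode = []
    for _ in range(episodes):
        s, steps, done = 0, 0, False
        while not done and steps < max_steps:
            # Epsilon-greedy action selection.
            a = rng.choice(ACTIONS) if rng.random() < eps else greedy(q[s], rng)
            s2, r, done = step(s, a)
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s, steps = s2, steps + 1
        steps_per_episode.append(steps)
    return q, steps_per_episode

def extract_knowledge(q):
    """Assumed form of 'knowledge': the greedy action for each non-terminal observation."""
    return {s: max(ACTIONS, key=lambda a: q[s][a]) for s in range(N_STATES - 1)}

def seed_q(knowledge, bonus=0.5):
    """Build a fresh Q-table that is biased toward the transferred actions."""
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for s, a in knowledge.items():
        q[s][a] = bonus
    return q

# Source agent learns from scratch, then its greedy choices are transferred.
source_q, _ = q_learn(episodes=200)
knowledge = extract_knowledge(source_q)

# Target agent starts from the seeded table; exploration is off to show the effect.
_, seeded_steps = q_learn(episodes=1, q=seed_q(knowledge), eps=0.0)
```

In this toy setting the seeded agent walks straight to the goal in its first episode, whereas an unseeded agent starts with a random walk. How the paper's TD network selects its "most promising" sequences, and how that knowledge enters the learner, may differ substantially from this simple Q-table seeding.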
