Source: 電子情報通信学会技術研究報告. ニューロコンピューティング (IEICE Technical Report, Neurocomputing)

Acceleration of game learning with prediction-based reinforcement learning - toward the emergence of planning behavior

Abstract

When humans solve a problem, they are unlikely to decide on an action from the current state of the problem alone. The state-to-action model, which is the main approach in conventional reinforcement learning (RL), therefore has difficulty explaining the human action decision strategy. Instead, humans appear to predict a future state from past experience and decide on an action based on that predicted state. In this paper, we propose a prediction-based RL model (PRL model). In the PRL model, a state prediction module and an action memory module are added to an actor-critic type RL system, and the system predicts and evaluates a future state from the current one based on an expected value table. The system then chooses a point of action decision in order to perform the appropriate action. To evaluate the proposed model, we perform a computer simulation using a simple ping pong game.
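
The idea of predicting a future state and evaluating it against an expected value table can be illustrated with a minimal tabular sketch. This is not the authors' implementation: the abstract gives no details of the modules or the ping pong game, so the corridor task, the class `PredictiveAgent`, the functions `step` and `run_episode`, and all parameter values below are hypothetical. The sketch also simplifies the actor-critic design by replacing the learned actor with a greedy one-step lookahead over predicted states, while the critic is updated with TD(0).

```python
import random
from collections import defaultdict

GOAL = 4  # hypothetical corridor: states 0..4, reward 1 on reaching state 4


def step(state, action):
    """Toy environment (an assumption, not the paper's ping pong game)."""
    nxt = min(max(state + action, 0), GOAL)
    return nxt, (1.0 if nxt == GOAL else 0.0), nxt == GOAL


class PredictiveAgent:
    def __init__(self, actions=(-1, 1), alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)
        self.value = defaultdict(float)  # critic: expected value table V(s)
        self.model = {}                  # state prediction: (s, a) -> last seen (s', r)
        self.action_memory = {}          # s -> most recent greedy action chosen in s

    def predict(self, state, action):
        """Return the remembered (next_state, reward) for (state, action), or None."""
        return self.model.get((state, action))

    def score(self, state, action):
        """Evaluate the predicted future state: r + gamma * V(s'); 0 if never seen."""
        pred = self.predict(state, action)
        if pred is None:
            return 0.0
        nxt, reward = pred
        return reward + self.gamma * self.value[nxt]

    def act(self, state, explore=True):
        """Epsilon-greedy choice over the scores of the predicted states."""
        if explore and self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        scores = {a: self.score(state, a) for a in self.actions}
        best = max(scores.values())
        action = self.rng.choice([a for a in self.actions if scores[a] == best])
        self.action_memory[state] = action
        return action

    def learn(self, state, action, reward, nxt, done):
        """Store the observed transition and apply a TD(0) update to the critic."""
        self.model[(state, action)] = (nxt, reward)
        target = reward + (0.0 if done else self.gamma * self.value[nxt])
        self.value[state] += self.alpha * (target - self.value[state])


def run_episode(agent, explore=True, max_steps=100):
    """Run one episode from state 0 and return the number of steps taken."""
    state = 0
    for t in range(1, max_steps + 1):
        action = agent.act(state, explore)
        nxt, reward, done = step(state, action)
        if explore:
            agent.learn(state, action, reward, nxt, done)
        state = nxt
        if done:
            return t
    return max_steps


agent = PredictiveAgent()
for _ in range(500):
    run_episode(agent)
print(run_episode(agent, explore=False))  # steps used by the greedy policy
```

Because unseen transitions score zero, the agent behaves randomly until it has experienced the reward once; after that, the stored predictions let it rank actions by the value of the state each one is expected to lead to.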
