Acceleration of game learning with prediction-based reinforcement learning - toward the emergence of planning behavior

Yu Ohigashi; Takashi Omori; Koji Morikawa; Natsuki Oka

Abstract

When humans solve a problem, they are unlikely to decide on an action from the current state of the problem alone. The state-to-action model, the dominant approach in conventional reinforcement learning (RL), therefore has difficulty explaining how humans decide on actions. Instead, humans appear to use past experience to predict a future state and then decide on an action based on that predicted state. In this paper, we propose a prediction-based RL model (PRL model). The PRL model adds a state prediction module and an action memory module to an actor-critic type RL system; the system predicts a future state from the current one and evaluates it using an expected value table. The system then chooses a point of action decision in order to perform the appropriate action. To evaluate the proposed model, we perform a computer simulation using a simple ping pong game.
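The idea of evaluating a predicted state with a value table can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no algorithmic detail, so the toy catch game, the grid size, the learning parameters, and the names `step`, `PredictiveActorCritic`, and `run_episode` are all assumptions made for illustration. The action memory module is left out because the abstract does not describe it, and the learned table also stores the observed reward, which is a further assumption.

```python
import random
from collections import defaultdict

WIDTH, HEIGHT = 5, 6      # hypothetical grid size for a toy catch game
ACTIONS = (-1, 0, 1)      # paddle moves: left, stay, right

def step(state, action):
    """The ball drops one row per step; the reward arrives on the bottom row."""
    ball_x, ball_y, paddle_x = state
    paddle_x = min(WIDTH - 1, max(0, paddle_x + action))
    ball_y += 1
    done = ball_y == HEIGHT - 1
    reward = (1.0 if ball_x == paddle_x else -1.0) if done else 0.0
    return (ball_x, ball_y, paddle_x), reward, done

class PredictiveActorCritic:
    """Tabular actor-critic with a learned one-step state prediction table."""

    def __init__(self, alpha=0.2, gamma=0.9, epsilon=0.1, seed=0):
        self.value = defaultdict(float)  # critic: expected value table V(s)
        self.pref = defaultdict(float)   # actor: preference for (s, a)
        self.model = {}                  # prediction: (s, a) -> (s', r)
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = random.Random(seed)

    def act(self, state, explore=True):
        if explore and self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)

        def score(action):
            predicted = self.model.get((state, action))
            if predicted is None:        # transition not experienced yet
                return self.pref[(state, action)]
            next_state, reward = predicted
            # Evaluate the predicted future state with the value table.
            lookahead = reward + self.gamma * self.value[next_state]
            return lookahead + self.pref[(state, action)]

        return max(ACTIONS, key=score)

    def learn(self, state, action, reward, next_state, done):
        # Remember the observed transition so it can be predicted later.
        self.model[(state, action)] = (next_state, reward)
        future = 0.0 if done else self.gamma * self.value[next_state]
        td_error = reward + future - self.value[state]
        self.value[state] += self.alpha * td_error
        self.pref[(state, action)] += self.alpha * td_error

def run_episode(agent, rng, explore=True):
    """Play one ball drop; returns the final reward (+1.0 catch, -1.0 miss)."""
    state = (rng.randrange(WIDTH), 0, rng.randrange(WIDTH))
    done = False
    while not done:
        action = agent.act(state, explore)
        next_state, reward, done = step(state, action)
        if explore:
            agent.learn(state, action, reward, next_state, done)
        state = next_state
    return reward
```

Calling `run_episode(agent, random.Random(1))` repeatedly trains the agent, and calling it with `explore=False` evaluates the greedy policy without further learning. The sketch only shows where a prediction table plugs into action selection; it makes no claim about the learning speed reported in the paper.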

Bibliographic details

Source: 電子情報通信学会技術研究報告. ニューロコンピューティング. Neurocomputing, 2002, No. 627, 6 pages
Original format: PDF
Language of text: Japanese
Chinese Library Classification: Artificial intelligence theory
Keywords: Model based reinforcement learning; Planning; Prediction
