Acceleration of Game Learning with Prediction-Based Reinforcement Learning - Toward the Emergence of Planning Behavior


Abstract

When humans solve a problem, it is unlikely that they use only the current state of the problem to decide upon an action. It is difficult to explain the human action decision strategy by means of the state-to-action model, which is the major method used in conventional reinforcement learning (RL). Rather, humans appear to predict a future state through the use of past experience and decide upon an action based on that predicted state. In this paper, we propose a prediction-based RL model (PRL model). In the PRL model, a state prediction module and an action memory module are added to an actor-critic type RL system, and the system predicts and evaluates a future state from the current one based on an expected value table. The system then chooses a point of action decision in order to perform the appropriate action. To evaluate the proposed model, we perform a computer simulation using a simple ping pong game. We also discuss the possibility that the PRL model may represent an evolutionary change in conventional RL as well as a step toward modeling human planning behavior, because state prediction and its evaluation are the basic elements of planning in symbolic AI.
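
To make the idea in the abstract concrete, the following is a minimal sketch of a tabular actor-critic agent that also learns a one-step state predictor and scores each action by the value of the state it predicts. It is not the authors' PRL model: the class name `PredictiveActorCritic`, the count-based predictor, the scoring rule, and all hyperparameters are assumptions made for illustration, and the action memory module and the choice of a "point of action decision" are not modeled here.

```python
import random
from collections import defaultdict


class PredictiveActorCritic:
    """Illustrative tabular actor-critic with a count-based state predictor.

    All names and update rules are assumptions for this sketch,
    not details taken from the paper.
    """

    def __init__(self, n_actions, alpha=0.1, beta=0.1, gamma=0.9, seed=0):
        self.n_actions = n_actions
        self.alpha = alpha    # critic learning rate (assumed)
        self.beta = beta      # actor learning rate (assumed)
        self.gamma = gamma    # discount factor (assumed)
        self.value = defaultdict(float)   # critic: expected value table V(s)
        self.pref = defaultdict(float)    # actor: preference for (s, a)
        # predictor: (s, a) -> {next_state: times observed}
        self.counts = defaultdict(lambda: defaultdict(int))
        self.rng = random.Random(seed)

    def predict(self, state, action):
        """Return the most often observed successor of (state, action), or None."""
        seen = self.counts[(state, action)]
        return max(seen, key=seen.get) if seen else None

    def act(self, state, epsilon=0.1):
        """Pick an action by actor preference plus the value of the predicted state."""
        if self.rng.random() < epsilon:
            return self.rng.randrange(self.n_actions)

        def score(action):
            nxt = self.predict(state, action)
            predicted_value = self.value[nxt] if nxt is not None else 0.0
            return self.pref[(state, action)] + self.gamma * predicted_value

        return max(range(self.n_actions), key=score)

    def update(self, state, action, reward, next_state, done):
        """Update the predictor, then apply a TD(0) actor-critic update."""
        self.counts[(state, action)][next_state] += 1
        bootstrap = 0.0 if done else self.gamma * self.value[next_state]
        td_error = reward + bootstrap - self.value[state]
        self.value[state] += self.alpha * td_error
        self.pref[(state, action)] += self.beta * td_error
        return td_error
```

In a game loop, `act` would be called on each observed state and `update` on each resulting transition. States may be any hashable description of the game, for example a tuple of discretized ball and paddle positions.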


Bibliographic record

Source
Joint International Conference on Artificial Neural Networks and Neural Information Processing - ICANN/ICONIP 2003, Jun 26-29, 2003, Istanbul, Turkey | 2003 | pp. 786-793 | 8 pages
Conference location: Istanbul (TR)
Authors
Yu Ohigashi; Takashi Omori; Koji Morikawa; Natsuki Oka
Author affiliation

Graduate School of Engineering, Hokkaido University, Kita 13 jyou Nishi 8 chome, Kita, Sapporo, Hokkaido, 060-8628, Japan;

Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology

