An Adaptive Strategy Model for Opponent's Characteristics based on Reinforcement Learning

Masahiro Ono; Mitsuru Shiozaki; Mamoru Sasaki; Atsushi Iwata

Abstract

To create a robot brain with intelligent action strategies, we propose a model that forms strategies for winning a game. During a game, the model can form several strategies and adaptively select or switch among them as the opponent's characteristics change. Strategies are formed with Q-PSP reinforcement learning because of its fast learning speed. Selection and switching of the formed strategies are based on the similarity between two kinds of Q-functions: (1) Q_x, obtained when each strategy is learned, and (2) Q_m, used to recognize the opponent's characteristics. We wrote a simulation program for an air hockey game based on the proposed strategy model. In simulation, we confirmed that strategy formation and selection/switching operate as intended, and we evaluated the effectiveness of the proposed model.
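
The selection step described in the abstract, choosing a stored strategy by comparing its Q-function with the one that characterizes the current opponent, can be illustrated with a minimal sketch. The abstract does not say how similarity is measured or how Q-functions are represented, so everything here is an assumption made for illustration: Q-functions are flattened to equal-length lists, similarity is cosine similarity, and the names `cosine_similarity`, `select_strategy`, `threshold`, and the strategy labels are hypothetical. Q-PSP learning itself is not shown.

```python
import math


def cosine_similarity(q_a, q_b):
    """Cosine similarity between two flattened Q-tables of equal length.

    Assumption: the paper's actual similarity measure is not given in the
    abstract; cosine similarity is used here only as a stand-in.
    """
    dot = sum(a * b for a, b in zip(q_a, q_b))
    norm_a = math.sqrt(sum(a * a for a in q_a))
    norm_b = math.sqrt(sum(b * b for b in q_b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero table matches nothing
    return dot / (norm_a * norm_b)


def select_strategy(q_m, strategies, threshold=0.9):
    """Pick the stored strategy whose Q_x is most similar to Q_m.

    Returns the strategy name, or None when no stored strategy reaches the
    (hypothetical) threshold, which a caller could treat as a cue to learn
    a new strategy.
    """
    best_name, best_sim = None, -1.0
    for name, q_x in strategies.items():
        sim = cosine_similarity(q_m, q_x)
        if sim > best_sim:
            best_name, best_sim = name, sim
    return best_name if best_sim >= threshold else None


# Hypothetical stored strategies: one flattened Q_x per learned strategy.
strategies = {
    "vs_aggressive": [1.0, 0.2, 0.0, 0.8],
    "vs_defensive": [0.1, 0.9, 0.7, 0.0],
}

# Q_m built from observing the current opponent (made-up values).
q_m = [0.9, 0.3, 0.1, 0.7]
chosen = select_strategy(q_m, strategies)

# An opponent unlike any stored strategy falls below the threshold.
unknown = select_strategy([0.0, 0.0, 1.0, 0.0], strategies)
```

With these made-up values, `chosen` is `"vs_aggressive"` and `unknown` is `None`. Returning `None` below a threshold is one possible way to separate "switch to a known strategy" from "form a new one"; the abstract does not state how the original model makes that distinction.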

Bibliographic details

Source
電子情報通信学会技術研究報告. ニューロコンピューティング. Neurocomputing | 2003, No. 228 | 6 pages
Authors
Masahiro Ono; Mitsuru Shiozaki; Mamoru Sasaki; Atsushi Iwata
Original format: PDF
Language of text: Japanese
CLC classification: Artificial intelligence theory
Keywords
Brain of robot; Strategy model; Reinforcement learning; Q-function; Strategy making; Strategy selecting/switching
