IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning

A two stage learning technique for dual learning in the pursuit-evasion differential game


Abstract

This paper addresses the case of dual learning in the pursuit-evasion (PE) differential game and examines how quickly the players can learn their default control strategies. The players must learn their default control strategies simultaneously by interacting with each other, and each player's learning process depends on the rewards received from its environment. The learning process is implemented using a two-stage learning algorithm that combines the particle swarm optimization (PSO)-based fuzzy logic control (FLC) algorithm with the Q-learning fuzzy inference system (QFIS) algorithm. The PSO algorithm serves as a global optimizer that autonomously tunes the parameters of a fuzzy logic controller, whereas the QFIS algorithm serves as a local optimizer. The two-stage learning algorithm is compared through simulation with the default control strategy, the PSO-based FLC algorithm, and the QFIS algorithm. Simulation results show that the players are able to learn their default control strategies, and that the two-stage algorithm outperforms both the PSO-based FLC algorithm and the QFIS algorithm with respect to learning time.
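The coarse-to-fine structure described in the abstract can be sketched generically: a population-based global search (PSO) first locates a promising region of the controller's parameter space, and a local optimizer then refines that solution. The sketch below is a simplified illustration, not the paper's implementation: the quadratic surrogate cost, the parameter bounds, and the stochastic hill-climbing stand-in for the QFIS local stage are all assumptions made for brevity.

```python
import random

def pso(cost, dim, n_particles=20, iters=50, bounds=(-5.0, 5.0)):
    """Stage 1: global search with a basic particle swarm optimizer."""
    lo, hi = bounds
    pos = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                     # each particle's best position
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]  # swarm-wide best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                # inertia + cognitive pull (pbest) + social pull (gbest)
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost

def local_refine(cost, x, step=0.1, iters=200):
    """Stage 2: local refinement (a stand-in for the QFIS local optimizer)."""
    x, best = x[:], cost(x)
    for _ in range(iters):
        cand = [xi + random.gauss(0.0, step) for xi in x]
        c = cost(cand)
        if c < best:
            x, best = cand, c
    return x, best

# Hypothetical surrogate cost: distance of controller parameters from an
# (unknown) optimum, standing in for the reward signal from the PE game.
target = [1.0, -2.0, 0.5]
cost = lambda p: sum((pi - ti) ** 2 for pi, ti in zip(p, target))

random.seed(0)
coarse, c1 = pso(cost, dim=3)          # stage 1: global tuning
fine, c2 = local_refine(cost, coarse)  # stage 2: local polishing
assert c2 <= c1                        # refinement never worsens the solution
```

The division of labor mirrors the abstract's claim: PSO avoids poor local minima that a purely local learner would get trapped in, while the cheap local stage sharpens the swarm's coarse answer, reducing the total learning time compared with either stage alone.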
