Policy gradient fuzzy reinforcement learning

Abstract

This work presents a new approach for tuning the conclusions of fuzzy rules using reinforcement learning. Unlike most existing fuzzy reinforcement learning algorithms, which are based on value functions, our approach, called policy gradient fuzzy reinforcement learning (PGFRL), is based on gradient estimation. In PGFRL, the GPOMDP algorithm is employed to estimate the performance gradient with respect to the parameters of the fuzzy rules. We prove that the fuzzy rules' parameters converge to a local optimum under the necessary conditions. Experimental results show the effectiveness of PGFRL.
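
As a rough illustration of the idea in the abstract, the sketch below applies a GPOMDP-style gradient estimate to the conclusion parameters of a small fuzzy rule base. The abstract does not give the paper's policy parametrization, so everything concrete here is an assumption made for illustration: the normalized Gaussian memberships, the softmax over rule-weighted conclusions `theta[i][a]`, the one-dimensional toy environment, and all function names and constants.

```python
import math
import random


def firing_strengths(s, centers, width):
    """Normalized Gaussian membership degrees of a scalar state s (assumed form)."""
    w = [math.exp(-((s - c) / width) ** 2) for c in centers]
    total = sum(w)
    return [wi / total for wi in w]


def action_probs(theta, phi):
    """Softmax over actions of the rule-weighted conclusions theta[i][a]."""
    n_actions = len(theta[0])
    prefs = [sum(phi[i] * theta[i][a] for i in range(len(phi)))
             for a in range(n_actions)]
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    z = sum(exps)
    return [e / z for e in exps]


def grad_log_policy(phi, probs, action):
    """d log pi(action|s) / d theta[i][b] = phi[i] * (1[b == action] - probs[b])."""
    return [[phi[i] * ((1.0 if b == action else 0.0) - probs[b])
             for b in range(len(probs))]
            for i in range(len(phi))]


def gpomdp(theta, centers, width, steps, beta, rng):
    """Single-trajectory GPOMDP estimate of the average-reward gradient."""
    n_rules, n_actions = len(theta), len(theta[0])
    z = [[0.0] * n_actions for _ in range(n_rules)]      # eligibility trace
    delta = [[0.0] * n_actions for _ in range(n_rules)]  # running gradient estimate
    moves = (-0.1, 0.1)  # hypothetical toy task: action 0 steps left, 1 steps right
    s = rng.uniform(-1.0, 1.0)
    for t in range(steps):
        phi = firing_strengths(s, centers, width)
        probs = action_probs(theta, phi)
        a = rng.choices(range(n_actions), weights=probs)[0]
        g = grad_log_policy(phi, probs, a)
        s = max(-1.0, min(1.0, s + moves[a]))
        r = -abs(s)  # assumed reward: stay close to the origin
        for i in range(n_rules):
            for b in range(n_actions):
                z[i][b] = beta * z[i][b] + g[i][b]
                delta[i][b] += (r * z[i][b] - delta[i][b]) / (t + 1)
    return delta


def train(iterations=20, steps=1000, beta=0.9, lr=0.5, seed=0):
    """Plain gradient ascent on the rule conclusions using the estimate above."""
    rng = random.Random(seed)
    centers = [-1.0, 0.0, 1.0]
    theta = [[0.0, 0.0] for _ in centers]
    for _ in range(iterations):
        grad = gpomdp(theta, centers, 0.5, steps, beta, rng)
        for i in range(len(theta)):
            for b in range(len(theta[i])):
                theta[i][b] += lr * grad[i][b]
    return theta


if __name__ == "__main__":
    print(train())
```

The discount-like parameter `beta` trades bias against variance in the estimate: values closer to 1 reduce bias but make the eligibility trace, and therefore the estimate, noisier. The step size and trajectory length in `train` are arbitrary choices for the toy task, not values taken from the paper.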

Bibliographic details

Source: Machine Learning and Cybernetics | 2004 | 4 pages
Authors: Xue-Ning Wang; Xin Xu; Han-Gen He
Original format: PDF
Chinese Library Classification: automation technology; computer technology
Keywords: fuzzy control; learning (artificial intelligence); gradient methods; fuzzy rules; policy gradient fuzzy reinforcement learning; gradient estimate

Similar literature

1. An Algorithm of Policy Gradient Reinforcement Learning with a Fuzzy Controller in Policies [J]. Harukazu Igarashi, Seiji Ishihara. International Journal of Artificial Intelligence and Expert Systems (IJAE). 2013, Issue 1
2. Deep reinforcement learning collision avoidance using policy gradient optimisation and Q-learning [J]. Shady A. Maged, Bishoy H. Mikhail. International Journal of Computational Vision and Robotics. 2020, Issue 3
3. Learning rate free reinforcement learning for real-time motion control using a value-gradient based policy [J]. van Rooijen J. C., Grondman I., Babuska R. Mechatronics: The Science of Intelligent Machines. 2014, Issue 8
4. Robot reinforcement learning accuracy-based learning classifier systems with Fuzzy Policy Gradient descent (XCS-FPGRL) [C]. Jie Shao, Jingru Yu. International Conference on Advances in Mechanical Engineering and Industrial. 2015
5. Learning in Pursuit-Evasion Differential Games Using Reinforcement Fuzzy Learning [D]. Al Faiya, Badr. 2012
6. Correction: Spike-Based Reinforcement Learning in Continuous State and Action Space: When Policy Gradient Methods Fail [O]. Eleni Vasilaki, Nicolas Frémaux, Robert Urbanczik. 2009
7. Policy Gradient Reinforcement Learning with a Fuzzy Controller for Policy: Decision Making in RoboCup Soccer Small Size League [O]. Masaya Sugimoto, Harukazu Igarashi, Seiji Ishihara. 2014
