Journal: Fuzzy Sets and Systems

Reinforcement Distribution In Fuzzy Q-learning

Abstract

Q-learning is one of the most popular reinforcement learning methods; it allows an agent to learn the relationship between interval-valued state and action spaces through direct interaction with the environment. Fuzzy Q-learning extends this algorithm so that it can evolve fuzzy inference systems (FIS) that range over continuous state and action spaces. In a FIS, the interaction among fuzzy rules plays a primary role in achieving good performance and robustness. Learning a system in which this interaction is present poses problems for the learning mechanism, because the same rule may receive incoherent reinforcements as a result of its interaction with other rules. In this paper, we introduce different strategies for distributing reinforcement that reduce this undesired effect and stabilize the obtained reinforcement. In particular, we present two strategies: the former focuses on rewarding the actions chosen by each rule during the cooperation phase; the latter rewards the rules whose actions are closer to those actually executed, rather than the rules that contributed to generating such actions.
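
The abstract gives no formulas, so the following is only a rough sketch of what two such credit-assignment schemes might look like, not the paper's actual method. Every name here (`contribution_weights`, `closeness_weights`, `update_q`), the Gaussian closeness kernel, and the learning parameters are assumptions made for illustration. The first weighting credits each rule by its normalized firing strength; the second credits each rule by how close its proposed action is to the action actually executed.

```python
import numpy as np

def contribution_weights(firing):
    """Credit each rule in proportion to its normalized firing strength."""
    firing = np.asarray(firing, dtype=float)
    return firing / firing.sum()

def closeness_weights(rule_actions, executed_action, width=1.0):
    """Credit each rule by the closeness of its proposed action to the
    executed one. The Gaussian kernel is an assumed choice."""
    d = np.asarray(rule_actions, dtype=float) - executed_action
    w = np.exp(-0.5 * (d / width) ** 2)
    return w / w.sum()

def update_q(q, chosen, weights, reward, next_value, alpha=0.1, gamma=0.9):
    """One Q-learning step per rule, scaled by that rule's credit weight.
    q has shape (n_rules, n_actions); chosen[i] is rule i's action index.
    next_value is the (scalar) estimated value of the next state."""
    q = q.copy()
    rules = np.arange(q.shape[0])
    td_error = reward + gamma * next_value - q[rules, chosen]
    q[rules, chosen] += alpha * weights * td_error
    return q

# Hypothetical example: three rules, three candidate actions.
actions = np.array([-1.0, 0.0, 1.0])
q = np.zeros((3, 3))
firing = np.array([0.6, 0.3, 0.1])
chosen = np.array([2, 1, 0])          # action index picked by each rule
proposed = actions[chosen]            # action value proposed by each rule

w_contrib = contribution_weights(firing)
executed = float(np.dot(w_contrib, proposed))   # blended action sent to the environment
w_close = closeness_weights(proposed, executed)

q_contrib = update_q(q, chosen, w_contrib, reward=1.0, next_value=0.0)
q_close = update_q(q, chosen, w_close, reward=1.0, next_value=0.0)
```

In this sketch the two schemes differ only in the weight vector passed to `update_q`, which makes it easy to compare how the same reward is spread across rules under each assumption.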
