Journal: International Journal of Modeling, Simulation and Scientific Computing

Hierarchical fuzzy ART for Q-learning and its application in air combat simulation


Abstract

Value function approximation plays an important role in reinforcement learning (RL) with continuous state spaces, which is widely used to build decision models in practice. Many traditional approaches require experienced designers to specify the form of the approximating function by hand, which leads to a rigid, non-adaptive representation of the value function. To address this problem, this paper proposes a novel Q-value function approximation method named 'Hierarchical fuzzy Adaptive Resonance Theory' (HiART). HiART is based on the Fuzzy ART method: it is an adaptive classification network that learns to segment the state space by classifying the training inputs automatically. HiART begins with a highly generalized structure in which the number of category nodes is limited, which speeds up learning at the early stage. The network is then refined gradually by creating attached subnetworks, and a layered network structure forms during this process. Because the structure adapts in this way, HiART reduces the dependence on expert experience when setting the network parameters. The effectiveness and adaptivity of HiART are demonstrated on the Mountain Car benchmark problem, where it achieves both fast learning and low computation time. Finally, a simulated one-versus-one air combat decision problem illustrates the applicability of HiART.
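To make the idea of segmenting a continuous state space by classification more concrete, the following is a minimal sketch of a standard single-layer Fuzzy ART classifier paired with a tabular Q-learning update over the resulting categories. It is not the HiART algorithm from the paper: the hierarchical subnetwork refinement is omitted, and the class name `FuzzyART`, the helper `q_update`, the parameter defaults, and the assumption that states are scaled to [0, 1] are all illustrative choices rather than details taken from the source.

```python
class FuzzyART:
    """Standard Fuzzy ART with complement coding (illustrative sketch only)."""

    def __init__(self, rho=0.75, alpha=0.001, beta=1.0):
        self.rho = rho      # vigilance: higher values give finer categories
        self.alpha = alpha  # choice parameter (small positive constant)
        self.beta = beta    # learning rate (1.0 = fast learning)
        self.weights = []   # one weight vector per category node

    @staticmethod
    def complement_code(x):
        # Input x is assumed to lie in [0, 1]^n; the code is [x, 1 - x].
        return list(x) + [1.0 - v for v in x]

    def classify(self, x, learn=True):
        """Return the index of the category that resonates with x."""
        i = self.complement_code(x)
        norm_i = sum(i)
        scored = []
        for j, w in enumerate(self.weights):
            match = sum(min(a, b) for a, b in zip(i, w))  # |I ^ w_j|
            choice = match / (self.alpha + sum(w))        # choice function T_j
            scored.append((choice, match, j))
        scored.sort(key=lambda t: -t[0])  # try the best-scoring node first
        for _, match, j in scored:
            if match / norm_i >= self.rho:  # vigilance test
                if learn:
                    w = self.weights[j]
                    self.weights[j] = [
                        self.beta * min(a, b) + (1.0 - self.beta) * b
                        for a, b in zip(i, w)
                    ]
                return j
        if not learn:
            return None
        self.weights.append(i)  # no node resonated: commit a new category
        return len(self.weights) - 1


def q_update(q, s, a, r, s_next, n_actions, lr=0.1, gamma=0.99):
    """One tabular Q-learning step, with category indices used as states."""
    q.setdefault(s, [0.0] * n_actions)
    q.setdefault(s_next, [0.0] * n_actions)
    td_target = r + gamma * max(q[s_next])
    q[s][a] += lr * (td_target - q[s][a])
    return q[s][a]
```

In this sketch, each continuous state is first mapped to a category index with `classify`, and that index is then used as the discrete state in `q_update`. A lower vigilance `rho` produces fewer, broader categories, which loosely corresponds to the coarse initial structure the abstract describes; the paper's gradual refinement through attached subnetworks would have to be built on top of this.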
