Journal: Information Sciences: An International Journal
Reinforcement learning with automatic basis construction based on isometric feature mapping

Abstract

Value function approximation (VFA) has been a major research topic in reinforcement learning. Although various reinforcement learning algorithms with VFA have been proposed, the performance of most previous algorithms depends on the predefined structure of the basis functions. To address this problem, this paper presents a novel basis learning method for VFA based on isometric feature mapping (IFM). In the proposed method, basis functions for VFA are generated automatically by constructing the optimal embedding of the data in a d-dimensional Euclidean space, that is, the embedding that best preserves the estimated intrinsic geometry of the manifold. Furthermore, the IFM-based basis learning method is integrated with approximate policy iteration (API) for learning control in Markov decision problems with large state spaces. The resulting manifold reinforcement learning framework is termed IFM-based API (IFM-API). Three learning control problems, including a real control system, the Googol single inverted pendulum, were studied to evaluate the performance of the proposed IFM-API algorithm. The simulation and experimental results show that, compared with other basis selection or learning methods, the IFM-based basis learning method can automatically compute an efficient set of basis functions with far fewer predefined parameters and lower computational cost. The results also indicate that the proposed IFM-API algorithm can obtain better learning control policies than other API methods.
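To make the basis construction step more concrete, the following is a minimal sketch of textbook isometric feature mapping (Isomap) applied to a set of sampled states. It is not the paper's implementation: the function name `isomap_basis`, the parameters `n_neighbors` and `d`, and the semicircle toy data are all assumptions chosen for illustration. The sketch builds a k-nearest-neighbour graph, approximates geodesic distances by shortest paths, and then applies classical multidimensional scaling to obtain a d-dimensional embedding whose coordinates can serve as basis function values at the sampled states.

```python
import numpy as np


def isomap_basis(states, n_neighbors=6, d=2):
    """Return an (n, d) Isomap embedding of the sampled states (one per row)."""
    n = states.shape[0]

    # Pairwise Euclidean distances in the original state space.
    diff = states[:, None, :] - states[None, :, :]
    euclid = np.sqrt((diff ** 2).sum(axis=-1))

    # Symmetric k-nearest-neighbour graph; missing edges have infinite length.
    graph = np.full((n, n), np.inf)
    np.fill_diagonal(graph, 0.0)
    nearest = np.argsort(euclid, axis=1)[:, 1:n_neighbors + 1]
    for i in range(n):
        graph[i, nearest[i]] = euclid[i, nearest[i]]
        graph[nearest[i], i] = euclid[i, nearest[i]]

    # Floyd-Warshall shortest paths approximate geodesic distances.
    geo = graph.copy()
    for k in range(n):
        geo = np.minimum(geo, geo[:, k:k + 1] + geo[k:k + 1, :])
    if np.isinf(geo).any():
        raise ValueError("graph is disconnected; increase n_neighbors")

    # Classical MDS: double-centre squared distances, keep top-d eigenpairs.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (geo ** 2) @ centering
    eigvals, eigvecs = np.linalg.eigh(gram)
    top = np.argsort(eigvals)[::-1][:d]
    return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0.0))


# Toy data (assumed): states sampled along a semicircle, a 1-D manifold in 2-D.
theta = np.linspace(0.0, np.pi, 40)
states = np.column_stack([np.cos(theta), np.sin(theta)])
embedding = isomap_basis(states, n_neighbors=2, d=1)

# Feature matrix for a linear value function: constant term plus learned basis.
features = np.hstack([np.ones((len(states), 1)), embedding])
```

In this toy case the single embedding coordinate should vary almost linearly with arc length along the semicircle, which is the geometry a Euclidean basis on the raw coordinates would not capture directly. The sketch only yields basis values at the sampled states; how the method evaluates the basis at unsampled states, and how it is coupled with policy iteration, is not described in the abstract and is left out here.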