Conference: Chinese Control Conference

Potential based policy gradient approach for optimal control of the stochastic system with unknown noise


Abstract

This paper considers the optimal control problem for a discrete-time stochastic system whose state space is continuous and whose noise has unknown probabilistic properties. First, the optimal control problem is transformed into a Markov decision process. A performance-derivative formula based on performance potentials is then used to estimate the derivative of the performance with respect to the control parameters; this estimate is the key step of the proposed policy gradient approach. Radial basis function (RBF) neural networks are used to estimate the state transition probability density function (PDF) and the potential function. With k_n-nearest-neighbor techniques, the sample pairs for training the RBF neural networks can be collected from a single sample path, so the policy gradient approach can be implemented online in practical applications. Simulation results show the effectiveness of the proposed approach.
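The abstract does not give the estimator details, so the following is only a minimal sketch of the general idea of using k_n-nearest-neighbor estimates as training pairs for an RBF network. It assumes a one-dimensional state, Gaussian basis functions with fixed centers and a shared width, and a plain least-mean-squares weight update. The function names (`knn_density`, `knn_training_pairs`, `train_rbf`, `rbf_predict`) and all parameter values are hypothetical and are not taken from the paper.

```python
import math


def knn_density(samples, x, k):
    """k-nearest-neighbor density estimate at x for 1-D samples.

    Uses p(x) ~= k / (n * 2 * r_k), where r_k is the distance from x to
    its k-th nearest sample and 2 * r_k is the length of that interval.
    """
    n = len(samples)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    r_k = sorted(abs(s - x) for s in samples)[k - 1]
    if r_k == 0.0:
        raise ValueError("k-th neighbor coincides with x; increase k")
    return k / (n * 2.0 * r_k)


def knn_training_pairs(samples, query_points, k):
    """Build (x, density estimate) pairs to use as RBF training targets."""
    return [(x, knn_density(samples, x, k)) for x in query_points]


def rbf_features(x, centers, width):
    """Gaussian basis activations for a scalar input (assumed basis form)."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def rbf_predict(x, weights, centers, width):
    """Output of a linear-in-weights RBF network."""
    return sum(w * f for w, f in zip(weights, rbf_features(x, centers, width)))


def train_rbf(pairs, centers, width, lr=0.05, epochs=200):
    """Fit the output weights with a least-mean-squares update.

    This is an assumed training rule chosen for brevity; the paper may
    train its networks differently.
    """
    weights = [0.0] * len(centers)
    for _ in range(epochs):
        for x, target in pairs:
            phi = rbf_features(x, centers, width)
            err = sum(w * f for w, f in zip(weights, phi)) - target
            weights = [w - lr * err * f for w, f in zip(weights, phi)]
    return weights


if __name__ == "__main__":
    # Evenly spaced points stand in for states observed along a sample path.
    samples = [i / 1000 for i in range(1001)]
    queries = [j / 20 for j in range(21)]
    pairs = knn_training_pairs(samples, queries, k=31)
    centers = [0.0, 0.25, 0.5, 0.75, 1.0]
    weights = train_rbf(pairs, centers, width=0.25)
    print(rbf_predict(0.5, weights, centers, width=0.25))
```

In this sketch the neighbor count `k` trades bias against variance: a larger `k` smooths the estimate but blurs local structure, and estimates near the edge of the sampled range are biased low because neighbors exist on one side only.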
