Chinese Control Conference

Potential based policy gradient approach for optimal control of the stochastic system with unknown noise



Abstract

This paper considers the optimal control problem for a discrete-time stochastic system whose state space is continuous and whose noise has an unknown probability distribution. First, the optimal control problem is transformed into a Markov decision process. Then, the performance-derivative formula based on performance potentials is applied to estimate the derivative of the performance with respect to the control parameters, which is the key step of the policy gradient approach in this paper. RBF neural networks are used to estimate the state-transition probability density function (PDF) and the potential function. With k_n-nearest-neighbor techniques, the sample pairs for training the RBF neural networks can be collected from a single sample path, so the policy gradient approach can be implemented online in practical applications. A simulation shows the effectiveness of the proposed approach.
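The abstract does not give the estimation details, so the following is only a minimal sketch of the general idea of pairing a k-nearest-neighbor estimate with an RBF network. It assumes a one-dimensional density instead of the full transition PDF, Gaussian basis functions with fixed centers, and a ridge-regularized least-squares fit in place of whatever training rule the paper uses. All function names and parameter values (`knn_density`, `fit_rbf`, `k=200`, `width=0.5`) are hypothetical and chosen for illustration.

```python
import numpy as np

def knn_density(samples, queries, k):
    """1-D k-nearest-neighbor density estimate: k / (n * 2 * r_k)."""
    samples = np.asarray(samples, dtype=float)
    queries = np.asarray(queries, dtype=float)
    # Distance from every query point to every sample, sorted per query.
    dists = np.sort(np.abs(queries[:, None] - samples[None, :]), axis=1)
    r_k = dists[:, k - 1]  # distance to the k-th nearest sample
    return k / (len(samples) * 2.0 * r_k)

def rbf_features(x, centers, width):
    """Gaussian radial basis functions evaluated at the points x."""
    x = np.asarray(x, dtype=float)
    return np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

def fit_rbf(x, y, centers, width, ridge=1e-6):
    """Output-layer weights from ridge-regularized least squares (an assumption)."""
    phi = rbf_features(x, centers, width)
    gram = phi.T @ phi + ridge * np.eye(len(centers))
    return np.linalg.solve(gram, phi.T @ y)

def predict_rbf(x, centers, width, weights):
    return rbf_features(x, centers, width) @ weights

# Stand-in for noise observed along a sample path (standard normal here).
rng = np.random.default_rng(0)
samples = rng.normal(0.0, 1.0, size=5000)

# Build (query, density) training pairs with k-NN, then fit the RBF network.
queries = np.linspace(-2.0, 2.0, 41)
targets = knn_density(samples, queries, k=200)
centers = np.linspace(-2.0, 2.0, 9)
weights = fit_rbf(queries, targets, centers, width=0.5)

# Smoothed density estimate at x = 0.
estimate = predict_rbf(np.array([0.0]), centers, 0.5, weights)[0]
print(f"estimated density at 0: {estimate:.3f}")
```

In this sketch the k-NN step turns raw samples into regression targets, and the RBF network smooths them into a function that can be evaluated at any state. Extending it to a conditional density p(y | x) or to the potential function would need additional modeling choices that the abstract does not specify.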
