Journal: Proceedings of the Institution of Mechanical Engineers

Adaptive estimator design for unstable output error systems: A test problem and traditional system identification based analysis


Abstract

A key open question in adaptive estimator design is how to ensure that the parameters of a proposed algorithm converge to almost correct values, so that the learning algorithm is unbiased. Determining the speed of parameter convergence is also important, as it gives insight into the performance of the learning algorithm. The article makes four main contributions. First, it introduces an adaptive estimator that learns the discounted Q-function and an approximate optimal control policy without requiring knowledge of the linear, discrete-time, unstable output error system dynamics, using only noisy system measurements. Simulation results show that the adaptive estimator minimizes the stochastic cost function and the temporal difference error, and learns the approximate Q-function together with the control policy. Second, it takes a different approach, using a simple test problem to investigate issues associated with the Q-function's representation and parametric convergence. In particular, the terminal convergence problem is analyzed with a known optimal control policy, where the aim is to accurately learn only the Q-function. The Q-function is parameterized by terms that are functions of the unknown plant's parameters and the Q-function's discount factor; their convergence properties are analyzed and compared with those of the adaptive estimator. Third, it shows that although an adaptive estimator with a large Q-function discount factor yields larger control feedback gains, so that the state converges to upright faster, the learning problem is badly conditioned: parameter convergence becomes sluggish as the Q-function discount factor approaches the inverse of the dominant pole of the unstable system. Fourth, it compares the state output learned by the adaptive estimator with the outputs obtained from traditional system identification algorithms. Simulation results for a higher-order unstable output error system show that the adaptive estimator closely follows the real system output, whereas the system identification algorithms do not.
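The link between the discount factor and the feedback gain can be illustrated with a textbook scalar discounted LQR problem. The sketch below is not the article's adaptive estimator: it assumes a fully known, noise-free model x[k+1] = a*x[k] + b*u[k] with cost sum of gamma^k * (q*x^2 + r*u^2), and the function name `discounted_lqr_gain` and the values a = 1.2, b = q = r = 1 are hypothetical choices made only for illustration.

```python
def discounted_lqr_gain(a, b, q, r, gamma, tol=1e-12, max_iter=100_000):
    """Scalar discounted LQR with a KNOWN model (illustrative assumption).

    Iterates the discounted Riccati recursion from p = 0 until it stops
    changing, then returns (p, k), where the policy is u = -k * x and the
    optimal value function is V(x) = p * x**2.
    """
    p = 0.0
    for _ in range(max_iter):
        p_next = (q + gamma * a * a * p
                  - (gamma * a * b * p) ** 2 / (r + gamma * b * b * p))
        converged = abs(p_next - p) < tol
        p = p_next
        if converged:
            break
    k = gamma * a * b * p / (r + gamma * b * b * p)
    return p, k


if __name__ == "__main__":
    # Hypothetical unstable plant: open-loop pole a = 1.2 lies outside the unit circle.
    a, b, q, r = 1.2, 1.0, 1.0, 1.0
    for gamma in (0.5, 0.7, 0.9, 0.99):
        p, k = discounted_lqr_gain(a, b, q, r, gamma)
        print(f"gamma={gamma:.2f}  p={p:.4f}  K={k:.4f}  closed-loop pole={a - b * k:.4f}")
```

Under these assumptions the gain grows with the discount factor and the closed-loop pole moves toward the origin, which matches the qualitative trend described in the abstract. The sketch says nothing about the conditioning of the learning problem, which the article analyzes separately.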
