Journal: IET Cyber-Physical Systems: Theory & Applications

Estimation and control using sampling-based Bayesian reinforcement learning

Abstract

Real-world autonomous systems operate under uncertainty about both their pose and their dynamics. Autonomous control systems must therefore perform estimation and control simultaneously to remain robust to changing dynamics or modelling errors. However, information-gathering actions often conflict with the actions that are optimal for reaching control objectives, so a trade-off between exploration and exploitation is required. The problem setting considered here is discrete-time non-linear systems with process noise, input constraints, and parameter uncertainty. This study frames the problem as a Bayes-adaptive Markov decision process and solves it online using Monte Carlo tree search, with an unscented Kalman filter to account for process noise and parameter uncertainty. The method is compared with certainty-equivalent model predictive control and with a tree search method that approximates the QMDP solution, providing insight into when information gathering is useful. Discrete-time simulations characterise performance over a range of process noise levels and bounds on the unknown parameters. An offline optimisation method is used to select the Monte Carlo tree search parameters without hand-tuning. In lieu of recursive feasibility guarantees, a probabilistic bounding heuristic is offered that increases the probability of keeping the state within a desired region.
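
The problem setting named in the abstract can be written down compactly. The block below is only a sketch in generic notation: the symbols f, w_k, Q, U, theta, b_k, y, r, gamma and H are assumptions chosen for illustration and are not taken from the paper, and the additive Gaussian form of the process noise is likewise assumed.

```latex
% Assumed generic form of a discrete-time non-linear system with
% process noise, input constraints, and bounded unknown parameters.
\begin{align}
  x_{k+1} &= f(x_k, u_k, \theta) + w_k, \qquad w_k \sim \mathcal{N}(0, Q), \\
  u_k &\in \mathcal{U}, \qquad \theta \in [\theta_{\min}, \theta_{\max}].
\end{align}

% Belief over the state and the unknown parameters, given past inputs
% and measurements y (the measurement model is an assumption).
\begin{equation}
  b_k = p\left(x_k, \theta \mid u_{0:k-1},\, y_{1:k}\right).
\end{equation}

% Bayes-adaptive planning: the expectation is taken over the belief and
% its future updates, so actions that reduce uncertainty can pay off.
\begin{equation}
  \pi^{*} = \arg\max_{\pi}\;
  \mathbb{E}\left[\sum_{k=0}^{H-1} \gamma^{k}\, r(x_k, u_k) \;\middle|\; b_0, \pi\right].
\end{equation}

% Certainty-equivalent planning instead fixes the parameters at a point
% estimate and ignores how future data will change the belief.
\begin{equation}
  \hat{\theta}_k = \mathbb{E}_{b_k}[\theta], \qquad
  x_{k+1} \approx f(x_k, u_k, \hat{\theta}_k).
\end{equation}
```

Under this reading, the three planners in the comparison differ in how they treat the belief. The Bayes-adaptive planner propagates it through the search tree. The QMDP-style approximation, as that term is usually understood, assumes the uncertainty is resolved after the next step. The certainty-equivalent controller collapses the belief to a point estimate before planning.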
