American Control Conference; ACC '09

Robust adaptive Markov Decision Processes in multi-vehicle applications

Abstract

This paper presents a new robust and adaptive framework for Markov decision processes that accounts for errors in the transition probabilities. Robust policies are typically found off-line, but can be extremely conservative when implemented in the real system. Adaptive policies, on the other hand, are specifically suited for on-line implementation, but may display undesirable transient performance as the model is updated through learning. This paper presents a new method that exploits the individual strengths of the two approaches. The robust and adaptive framework protects the adaptation process from exhibiting worst-case performance during model updating, and is shown to converge to the true, optimal value function in the limit of a large number of state transition observations. The proposed framework is investigated in simulation and in actual flight experiments, and is shown to improve both transient behavior during adaptation and overall mission performance.
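To make the idea in the abstract concrete, the following is a minimal sketch and not the paper's algorithm. It assumes transition probabilities are estimated from observed counts, that the uncertainty set is an L1 ball around the estimate, and that the ball's radius shrinks as `scale / sqrt(n)` with the number of observations `n`. The function names, the radius rule, and the two-state example are all hypothetical choices made for illustration.

```python
from math import sqrt


def worst_case_expectation(p_hat, values, radius):
    """Minimize sum(p * v) over distributions p with ||p - p_hat||_1 <= radius."""
    p = list(p_hat)
    worst = min(range(len(values)), key=lambda i: values[i])
    # Shift up to radius/2 of probability mass onto the lowest-value state,
    shift = min(radius / 2.0, 1.0 - p[worst])
    p[worst] += shift
    # taking that mass from the highest-value states first.
    remaining = shift
    for i in sorted(range(len(values)), key=lambda i: values[i], reverse=True):
        if remaining <= 0.0:
            break
        if i == worst:
            continue
        taken = min(p[i], remaining)
        p[i] -= taken
        remaining -= taken
    return sum(pi * vi for pi, vi in zip(p, values))


def robust_value_iteration(counts, rewards, gamma=0.9, scale=1.0, iters=200):
    """counts[s][a] holds observed next-state counts; rewards[s][a] is a scalar.

    Assumption (not from the paper): the L1 radius is scale / sqrt(n),
    so the uncertainty set shrinks as more transitions are observed.
    """
    n_states = len(counts)
    values = [0.0] * n_states
    for _ in range(iters):
        updated = []
        for s in range(n_states):
            best = float("-inf")
            for a, row in enumerate(counts[s]):
                n = sum(row)
                p_hat = [c / n for c in row]
                radius = min(2.0, scale / sqrt(n))
                q = rewards[s][a] + gamma * worst_case_expectation(p_hat, values, radius)
                best = max(best, q)
            updated.append(best)
        values = updated
    return values


# Hypothetical two-state, two-action example; state 1 is the rewarding state.
rewards = [[0.0, 0.0], [1.0, 1.0]]
few = [[[3, 1], [1, 3]], [[2, 2], [1, 3]]]
many = [[[c * 1000 for c in row] for row in state] for state in few]

nominal = robust_value_iteration(few, rewards, scale=0.0)  # no robustness margin
robust_few = robust_value_iteration(few, rewards)
robust_many = robust_value_iteration(many, rewards)
print(nominal, robust_few, robust_many)
```

With few observations the robust values sit well below the nominal ones, which reflects the conservatism the abstract mentions. With many observations the uncertainty set shrinks and the robust values approach the nominal ones, which mirrors the convergence behavior the abstract claims for the proposed framework.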
