IFAC Symposium on Automatic Control in Aerospace

Particle Guidance: Applying POMDPs to the Optimization of Mid-Course Guidance Laws for Long-Range Missiles

Abstract

During the mid-course phase of an air-to-air missile engagement, choosing the optimal Guidance Point (GP) so as to maximize lock-on success and minimize intercept time is critical. Given the low computational resources available on board and a very constrained maneuvering time frame, GP-based algorithms must be efficient. We suggest an innovative approach using Reinforcement Learning (RL) to produce finite state controllers that can be executed efficiently - using table lookup - to meet the strict time limits of a target engagement. Instead of hand-crafting a GP-picking algorithm for every combination of sensor and aircraft configuration, a promising alternative is to model a missile-target engagement as a Partially Observable Markov Decision Process (POMDP) and automatically generate a controller for picking the best GP by solving the POMDP model. Using a recently developed offline algorithm called Monte Carlo Value Iteration (MCVI), we constructed continuous-state POMDP models and solved them directly, without discretizing the entire state space.
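
The key implementation point in the abstract is that the controller produced offline (e.g. by MCVI) is a finite state controller, so online execution reduces to table lookup. The sketch below illustrates one plausible form of such table-lookup execution in Python; the class name, the node/transition tables, and the observation labels are illustrative assumptions, not the paper's implementation.

```python
import random

# Minimal sketch of executing a finite state controller (policy graph)
# produced offline by a POMDP solver. All names and data structures here
# are illustrative assumptions, not the paper's code.

class FiniteStateController:
    def __init__(self, node_actions, node_transitions, start_node=0):
        # node_actions[n]            -> action (e.g. a candidate Guidance Point label)
        # node_transitions[(n, obs)] -> next controller node
        self.node_actions = node_actions
        self.node_transitions = node_transitions
        self.node = start_node

    def act(self):
        # Online execution is a table lookup: O(1) per decision step.
        return self.node_actions[self.node]

    def update(self, observation):
        # Advance the controller node using the (discretized) sensor observation;
        # stay in the current node if no transition is defined for it.
        self.node = self.node_transitions.get((self.node, observation), self.node)


# Hypothetical two-node controller: command GP_0 until the sensor reports
# a likely target detection, then switch to GP_1.
controller = FiniteStateController(
    node_actions={0: "GP_0", 1: "GP_1"},
    node_transitions={(0, "no_detect"): 0, (0, "detect"): 1, (1, "detect"): 1},
)

for _ in range(5):
    gp = controller.act()                         # table lookup, no planning online
    obs = random.choice(["no_detect", "detect"])  # stand-in for a sensor return
    controller.update(obs)
    print(gp, obs)
```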
