Journal: Engineering Applications of Artificial Intelligence

Deep replacement: Reinforcement learning based constellation management and autonomous replacement

Abstract

The Deep Reinforcement Learning (DRL) algorithm Proximal Policy Optimization (PPO2) is deployed on a custom spacecraft (S/C) build-and-loss model to determine whether an Artificial Intelligence (AI) can learn to monitor satellite constellation health and find an optimal replacement strategy. A custom environment simulates how S/C are built, are launched, generate revenue, and finally decay. The reinforcement learning agent successfully learned an optimal policy for two models: a Simplified Model, in which the financial cost of actions is ignored, and an Advanced Model, in which that cost is a major element. In both models, the AI monitors the constellations and takes multiple strategic and tactical actions to replace satellites in order to maintain constellation performance. In the Simplified Model, the PPO2 algorithm converged on an optimal solution after ~200,000 simulations. The Advanced Model was much more difficult for the AI to learn: performance drops during the early episodes, but the agent eventually converges to an optimal policy at ~25,000,000 simulations. With the Advanced Model, the AI takes actions that yield successful strategies for constellation management and satellite replacement while accounting for the financial implications of those actions. Thus, the methods in this paper are an initial step towards a real-world tool and AI application that can aid aerospace businesses in managing Low Earth Orbit (LEO) constellations. This type of AI application may become imperative for deploying and maintaining small-satellite mega-constellations.
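
To make the build, launch, revenue, and decay loop described in the abstract more concrete, the following is a minimal sketch of a toy environment in plain Python. It is not the paper's model: the class name `ToyConstellationEnv`, the three actions, the state layout, and every numeric parameter are hypothetical choices made for illustration only. The `include_costs` flag loosely mirrors the distinction between the Simplified Model (costs ignored) and the Advanced Model (costs included). No PPO2 training is performed here; a hand-written threshold rule stands in for the learned policy.

```python
import random
from dataclasses import dataclass, field

# Hypothetical action labels; the abstract does not list the paper's action set.
NOOP, BUILD, LAUNCH = 0, 1, 2


@dataclass
class ToyConstellationEnv:
    """Toy build/launch/revenue/decay loop. All numbers are illustrative assumptions."""
    target_size: int = 5          # satellites needed for full service
    fail_prob: float = 0.05       # per-step chance that an active satellite is lost
    revenue_per_sat: float = 1.0  # revenue per active satellite per step
    build_cost: float = 2.0       # cost of ordering one satellite
    launch_cost: float = 3.0      # cost of launching one satellite
    include_costs: bool = True    # False mimics a cost-free "simplified" setting
    seed: int = 0
    active: int = field(init=False)
    stored: int = field(init=False)

    def __post_init__(self):
        self.rng = random.Random(self.seed)
        self.reset()

    def reset(self):
        self.active = self.target_size  # start with a full constellation
        self.stored = 0                 # built satellites waiting on the ground
        return (self.active, self.stored)

    def step(self, action):
        cost = 0.0
        if action == BUILD:
            self.stored += 1
            cost = self.build_cost
        elif action == LAUNCH and self.stored > 0:
            self.stored -= 1
            self.active += 1
            cost = self.launch_cost
        # Decay: each active satellite fails independently this step.
        losses = sum(self.rng.random() < self.fail_prob for _ in range(self.active))
        self.active -= losses
        # Revenue is capped at the target size; surplus satellites earn nothing.
        reward = self.revenue_per_sat * min(self.active, self.target_size)
        if self.include_costs:
            reward -= cost
        return (self.active, self.stored), reward


def threshold_policy(obs, target_size=5):
    """Hand-written baseline, not a learned policy: replace losses one at a time."""
    active, stored = obs
    if active < target_size:
        return LAUNCH if stored > 0 else BUILD
    return NOOP


def run_episode(env, policy, steps=200):
    """Roll the environment forward and return the summed reward."""
    obs, total = env.reset(), 0.0
    for _ in range(steps):
        obs, reward = env.step(policy(obs))
        total += reward
    return total
```

Calling `run_episode(ToyConstellationEnv(), threshold_policy)` returns the total reward of the baseline over 200 steps. In a reinforcement learning setup, an agent trained with an algorithm such as PPO would take the place of `threshold_policy`, and the environment would typically be wrapped in whatever interface the chosen training library expects.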