Autonomous navigation of stratospheric balloons using reinforcement learning


Abstract

Data augmentation and a self-correcting design are used to develop a reinforcement-learning algorithm for the autonomous navigation of Loon superpressure balloons in challenging stratospheric weather conditions.

Efficiently navigating a superpressure balloon in the stratosphere(1) requires the integration of a multitude of cues, such as wind speed and solar elevation, and the process is complicated by forecast errors and sparse wind measurements. Coupled with the need to make decisions in real time, these factors rule out the use of conventional control techniques(2,3). Here we describe the use of reinforcement learning(4,5) to create a high-performing flight controller. Our algorithm uses data augmentation(6,7) and a self-correcting design to overcome the key technical challenge of reinforcement learning from imperfect data, which has proved to be a major obstacle to its application to physical systems(8). We deployed our controller to station Loon superpressure balloons at multiple locations across the globe, including a 39-day controlled experiment over the Pacific Ocean. Analyses show that the controller outperforms Loon's previous algorithm and is robust to the natural diversity in stratospheric winds. These results demonstrate that reinforcement learning is an effective solution to real-world autonomous control problems in which neither conventional methods nor human intervention suffice, offering clues about what may be needed to create artificially intelligent agents that continuously interact with real, dynamic environments.

Bibliographic details

  • Source
    Nature | 2020, issue 7836 | pp. 77-82 | 6 pages
  • Author affiliations

    Google Res Brain Team Montreal PQ Canada;

    Loon Mountain View CA 94043 USA;

    Google Res Brain Team Montreal PQ Canada;

    Loon Mountain View CA 94043 USA;

    Google Res Brain Team Montreal PQ Canada;

    Google Res Brain Team Montreal PQ Canada;

    Loon Mountain View CA 94043 USA;

    Google Res Brain Team Toronto ON Canada;

  • Indexed in: Science Citation Index (SCI); Engineering Index (EI); MEDLINE; Chemical Abstracts (CA)
  • Original format: PDF
  • Language of text: English

  • Date added to database: 2022-08-18 22:15:36
