Journal: IEEE Control Systems Letters

Multi-Robot Guided Policy Search for Learning Decentralized Swarm Control

Abstract

Multi-robot learning has been studied extensively in recent years, yet developing provably correct algorithms for learning decentralized control policies remains challenging. In this letter, we propose a sample-efficient multi-robot learning method based on guided policy search to learn decentralized swarm control policies. The proposed method uses distributed trajectory optimization to provide guiding trajectory samples for policy training. In turn, the learned policy is used to update the trajectory optimization results so that the guiding trajectories are reproducible by the current policy. The learning algorithm alternates between distributed trajectory optimization and policy optimization, and eventually converges to a solution with good long-term performance. We demonstrate the effectiveness of our method on a multi-robot rendezvous problem. Simulation results in a robot simulator show that our method efficiently learns a decentralized control policy with substantially fewer training samples.
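The alternation described in the abstract can be illustrated with a small toy sketch. This is not the authors' algorithm. It assumes single-integrator robots on a fixed ring communication graph, a shared linear policy, and a greedy one-step optimizer standing in for the distributed trajectory optimization. All names and parameters (`neighbor_offsets`, `rollout_guided`, `fit_policy`, `rho`, `effort`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def neighbor_offsets(x):
    """Local observation of each robot on a ring graph (an assumption):
    the sum of the relative positions of its two neighbors."""
    return (np.roll(x, 1, axis=0) - x) + (np.roll(x, -1, axis=0) - x)

def rollout_guided(x0, K, rho, effort=0.04, dt=0.1, steps=30):
    """Stand-in for trajectory optimization: a greedy one-step optimizer.
    At each step it applies the closed-form minimizer over u of
        ||x + dt*u - centroid||^2 + effort*||u||^2 + rho*||u - u_pi||^2,
    where u_pi is the current policy's action. The rho term keeps the
    guiding actions close to what the policy can reproduce."""
    x = x0.copy()
    obs, acts = [], []
    for _ in range(steps):
        z = neighbor_offsets(x)
        u_pi = z @ K                      # decentralized policy action
        centroid = x.mean(axis=0)         # global information, guide only
        u = (dt * (centroid - x) + rho * u_pi) / (dt**2 + effort + rho)
        obs.append(z)
        acts.append(u)
        x = x + dt * u                    # single-integrator dynamics
    return np.concatenate(obs), np.concatenate(acts)

def fit_policy(obs, acts):
    """Policy optimization step: least-squares fit of a shared linear
    policy u = z @ K to the guiding samples."""
    K, *_ = np.linalg.lstsq(obs, acts, rcond=None)
    return K

def policy_spread(x0, K, dt=0.1, steps=200):
    """Run the decentralized policy alone and return the final spread
    (Frobenius norm of the positions relative to their centroid)."""
    x = x0.copy()
    for _ in range(steps):
        x = x + dt * (neighbor_offsets(x) @ K)
    return float(np.linalg.norm(x - x.mean(axis=0)))

def train(n_robots=6, iters=5, seed=0):
    """Alternate between generating guiding samples and refitting the policy."""
    rng = np.random.default_rng(seed)
    K = np.zeros((2, 2))                  # policy starts with zero gain
    rho = 0.01                            # policy-agreement weight
    for _ in range(iters):
        x0 = rng.uniform(-1.0, 1.0, size=(n_robots, 2))
        obs, acts = rollout_guided(x0, K, rho)
        K = fit_policy(obs, acts)
        rho *= 2.0                        # tighten agreement over iterations
    return K
```

In this sketch the guide may use the swarm centroid, which no single robot observes, while the fitted policy acts only on neighbor offsets. Calling `train()` and then `policy_spread(x0, K)` on a fresh initial configuration gives a rough check of whether the learned gain drives the robots toward rendezvous.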
