Journal: Aerospace Science and Technology

Reinforcement learning in dual-arm trajectory planning for a free-floating space robot


Abstract

A free-floating space robot exhibits strong dynamic coupling between the arm and the base, so the resulting position of the end of the arm depends not only on the joint angles but also on the state of the base. Dynamic modeling is complicated for multi-degree-of-freedom (DOF) manipulators, especially for a space robot with two arms. Therefore, trajectories are typically planned offline and tracked online. However, this approach is not suitable if the target moves relative to the servicing space robot. To handle this issue, a model-free reinforcement learning strategy is proposed that trains a policy for online trajectory planning without establishing the dynamic and kinematic models of the space robot. The model-free learning algorithm learns a policy that maps states to actions via trial and error in a simulation environment. With the learned policy, which is represented by a feedforward neural network with two hidden layers, the space robot can schedule and perform actions quickly, which makes the approach suitable for real-time applications. The feasibility of the trained policy is demonstrated for both fixed and moving targets. © 2020 Elsevier Masson SAS. All rights reserved.
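To make the policy structure concrete, the following is a minimal sketch of a feedforward network with two hidden layers that maps a state vector to an action vector. The abstract only states the number of hidden layers; everything else here is an assumption for illustration: the layer widths, the tanh activations, the `max_rate` output scaling, the state and action dimensions, and the function names `make_policy` and `act` are all hypothetical. The weights are random and untrained, and the reinforcement learning algorithm that would train them is not shown.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for one fully connected layer."""
    scale = 1.0 / math.sqrt(n_in)  # assumed initialization range
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(x, layer, activation):
    """Apply one fully connected layer followed by an activation."""
    weights, biases = layer
    return [activation(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(weights, biases)]


def make_policy(state_dim, action_dim, hidden=(64, 64), seed=0):
    """Build a network with two hidden layers (widths are assumed)."""
    rng = random.Random(seed)
    sizes = [state_dim, *hidden, action_dim]
    return [init_layer(a, b, rng) for a, b in zip(sizes[:-1], sizes[1:])]


def act(policy, state, max_rate=1.0):
    """Map a state to a bounded action, e.g. joint-rate commands."""
    h = state
    for layer in policy:
        h = dense(h, layer, math.tanh)  # tanh keeps outputs in [-1, 1]
    return [max_rate * a for a in h]


# Hypothetical dimensions: the abstract does not define the state or action.
policy = make_policy(state_dim=6, action_dim=4)
action = act(policy, [0.1, -0.2, 0.3, 0.0, 0.5, -0.4], max_rate=0.5)
```

Because a forward pass is only a few small matrix-vector products, evaluating such a policy is cheap, which is consistent with the abstract's point about real-time use.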
