QMIX REINFORCEMENT LEARNING ALGORITHM-BASED SHIP WELDING SPOTS COLLABORATIVE WELDING METHOD USING MULTIPLE MANIPULATORS

Abstract

Disclosed is a QMIX reinforcement learning algorithm-based method for collaborative welding of ship welding spots using multiple manipulators. The method comprises the following steps: a) building a reinforcement learning environment and setting a welding area and an operation area in the environment; b) determining state values and action values of the manipulators; c) setting reward values according to the state values, the action values, and the tasks of collaborative welding and collision avoidance; d) calculating a local action value function for each manipulator from the state values and action values by means of a recurrent neural network, and performing an action selection process; e) obtaining an overall action value function for all manipulators from the local action value functions by means of a hypernetwork configured with non-negative weights; and f) constructing a loss function from the reward values of step c) and the overall action value function network of step e), calculating and updating the weights of the neural networks by back-propagation, and repeating the training process. The method does not depend on any system model, is simple and effective, and can accomplish collaborative welding of welding spots in environments with obstacles.
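
Steps d) to f) can be illustrated with a minimal sketch in pure Python. It is not the patented implementation: the abstract gives no network sizes, activation, or hyperparameters, so every name and value below (`select_action`, `MonotonicMixer`, `td_loss`, `hidden_dim=4`, `gamma=0.99`, the ELU activation, single-layer hypernetworks) is an assumption made for illustration. The recurrent network of step d) is omitted; each manipulator's local action values are simply passed in as lists. The point shown is that taking the absolute value of the state-conditioned mixing weights keeps them non-negative, so the overall value never decreases when any local value increases.

```python
import math
import random


def select_action(q_values, epsilon, rng):
    """Epsilon-greedy choice over one manipulator's local action values."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def elu(x):
    """ELU activation; it is monotonically increasing."""
    return x if x > 0 else math.exp(x) - 1.0


class MonotonicMixer:
    """Hypothetical QMIX-style mixer with state-conditioned, non-negative weights."""

    def __init__(self, n_agents, state_dim, hidden_dim=4, seed=0):
        rng = random.Random(seed)

        def rand_matrix(rows, cols):
            return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

        self.n_agents = n_agents
        self.hidden_dim = hidden_dim
        # Hypernetworks, reduced to single linear maps of the global state (assumption).
        self.hyper_w1 = rand_matrix(n_agents * hidden_dim, state_dim)
        self.hyper_b1 = rand_matrix(hidden_dim, state_dim)
        self.hyper_w2 = rand_matrix(hidden_dim, state_dim)
        self.hyper_b2 = rand_matrix(1, state_dim)

    def q_total(self, agent_qs, state):
        """Mix the chosen local action values into one overall action value."""
        # abs() enforces non-negative mixing weights; biases stay unconstrained.
        w1 = [abs(w) for w in matvec(self.hyper_w1, state)]
        b1 = matvec(self.hyper_b1, state)
        w2 = [abs(w) for w in matvec(self.hyper_w2, state)]
        b2 = matvec(self.hyper_b2, state)[0]
        hidden = []
        for j in range(self.hidden_dim):
            pre = b1[j] + sum(
                agent_qs[i] * w1[i * self.hidden_dim + j] for i in range(self.n_agents)
            )
            hidden.append(elu(pre))
        return b2 + sum(h * w for h, w in zip(hidden, w2))


def td_loss(q_tot, reward, next_q_tot, gamma=0.99, done=False):
    """Squared one-step temporal-difference error on the overall action value."""
    target = reward + (0.0 if done else gamma * next_q_tot)
    return (target - q_tot) ** 2
```

In a full training loop, `next_q_tot` would come from mixing each manipulator's greedy next-step value, typically with a separate target network, and the loss gradient would be back-propagated through both the mixer and the per-manipulator networks. An automatic differentiation library would normally replace the hand-written arithmetic used here.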
