Distributed training using off-policy actor-critic reinforcement learning

Abstract

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for training an action selection neural network that is used to select actions to be performed by an agent interacting with an environment. In one embodiment, the system includes a plurality of actor computing units and a plurality of learner computing units. The actor computing units generate trajectories of experience tuples, and the learner computing units use those trajectories to update the parameters of a learner action selection neural network using a reinforcement learning technique. The reinforcement learning technique may be an off-policy actor-critic reinforcement learning technique.
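The actor/learner split described in the abstract can be illustrated with a small, single-process sketch. This is not the patented method: the abstract gives no update rule, so the example assumes a toy two-armed bandit environment, a softmax policy over two logits in place of a neural network, a scalar baseline standing in for the critic, and a truncated importance weight to correct for the actors' stale (off-policy) parameters. All names here (`Experience`, `ToyEnv`, `Actor`, `Learner`, `train`, `rho_max`, `sync_every`) are hypothetical.

```python
import math
import random
from collections import namedtuple

# Hypothetical experience tuple; the field names are assumptions for this sketch.
Experience = namedtuple("Experience", "observation action reward behavior_prob")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class ToyEnv:
    """Assumed toy environment: action 1 pays reward 1.0, action 0 pays 0.0."""

    def step(self, action):
        return 0, float(action == 1)  # (next observation, reward)


class Actor:
    """Generates trajectories with its own, possibly stale, copy of the policy."""

    def __init__(self, rng):
        self.logits = [0.0, 0.0]
        self.env = ToyEnv()
        self.rng = rng

    def sync(self, learner_logits):
        # Copy the learner's current parameters into the actor.
        self.logits = list(learner_logits)

    def generate_trajectory(self, length):
        trajectory = []
        observation = 0
        for _ in range(length):
            probs = softmax(self.logits)
            action = 0 if self.rng.random() < probs[0] else 1
            next_observation, reward = self.env.step(action)
            # Record the behavior policy's probability for the off-policy correction.
            trajectory.append(Experience(observation, action, reward, probs[action]))
            observation = next_observation
        return trajectory


class Learner:
    """Updates policy logits (the 'actor') and a scalar baseline (the 'critic')."""

    def __init__(self, learning_rate=0.5, baseline_rate=0.1, rho_max=1.0):
        self.logits = [0.0, 0.0]
        self.baseline = 0.0
        self.learning_rate = learning_rate
        self.baseline_rate = baseline_rate
        self.rho_max = rho_max

    def update(self, trajectory):
        for exp in trajectory:
            probs = softmax(self.logits)
            # Truncated importance weight: learner policy vs. actor's behavior policy.
            rho = min(self.rho_max, probs[exp.action] / exp.behavior_prob)
            advantage = exp.reward - self.baseline
            for a in range(len(self.logits)):
                grad_log_prob = (1.0 if a == exp.action else 0.0) - probs[a]
                self.logits[a] += self.learning_rate * rho * advantage * grad_log_prob
            self.baseline += self.baseline_rate * rho * (exp.reward - self.baseline)


def train(num_actors=2, num_rounds=50, trajectory_length=5, sync_every=5, seed=0):
    """Runs actors and one learner sequentially; a real system would run them in parallel."""
    rng = random.Random(seed)
    actors = [Actor(rng) for _ in range(num_actors)]
    learner = Learner()
    for round_index in range(num_rounds):
        for actor in actors:
            if round_index % sync_every == 0:
                actor.sync(learner.logits)  # actors lag behind between syncs
            learner.update(actor.generate_trajectory(trajectory_length))
    return softmax(learner.logits)
```

Calling `train()` returns the learner's final action probabilities. In this toy setting the probability of the rewarded action should end up higher than that of the unrewarded one. The sketch uses a single learner for brevity, whereas the abstract describes a plurality of learner computing units.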