OPPONENT MODELING WITH ASYNCHRONOUS METHODS IN DEEP RL

Abstract

A computer system and method are described for extending parallelized asynchronous reinforcement learning to include agent modeling when training a neural network. A plurality of hardware processors or threads operate in coordination, each functioning as a worker process that is configured to simultaneously interact with a target computing environment, to compute local gradients based on a loss determination mechanism, and to update global network parameters. The loss determination mechanism includes at least a policy loss term (actor), a value loss term (critic), and a supervised cross-entropy loss. Further variations are described in which the neural network is adapted to include a latent space that tracks agent policy features.
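The three loss terms named in the abstract can be illustrated with a minimal single-step sketch. The abstract only lists the terms; the specific forms used here (an advantage-weighted log-likelihood for the actor, a squared error for the critic, a negative log-likelihood of the observed opponent action for the supervised term) and the names `combined_loss`, `value_weight`, and `opponent_weight` are assumptions chosen for illustration, not details taken from the patent.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combined_loss(policy_logits, action, value, ret,
                  opponent_logits, opponent_action,
                  value_weight=0.5, opponent_weight=1.0):
    """Return (total, policy_loss, value_loss, opponent_loss) for one step.

    value_weight and opponent_weight are hypothetical coefficients;
    the abstract does not say how the terms are weighted.
    """
    # Advantage estimate: observed return minus the critic's prediction.
    advantage = ret - value
    # Actor term: negative log-probability of the taken action, scaled by advantage.
    policy_loss = -math.log(softmax(policy_logits)[action]) * advantage
    # Critic term: squared error between return and predicted value.
    value_loss = (ret - value) ** 2
    # Supervised term: cross entropy against the opponent's observed action.
    opponent_loss = -math.log(softmax(opponent_logits)[opponent_action])
    total = policy_loss + value_weight * value_loss + opponent_weight * opponent_loss
    return total, policy_loss, value_loss, opponent_loss
```

In a setup like the one the abstract describes, each worker would presumably evaluate such a loss on its own experience, compute gradients locally, and apply them to the shared global parameters; that update step is omitted from this sketch.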
