
Sample-Efficient Deep Reinforcement Learning with Directed Associative Graph

Abstract

Reinforcement learning can be modeled mathematically as a Markov decision process. Consequently, the interaction samples and the connection relations between them are the two main types of information available for learning. However, most recent work on deep reinforcement learning treats samples independently, both within and across episodes. In this paper, to exploit more of this sample information, we propose an additional learning system based on a directed associative graph (DAG). The DAG is built over all trajectories in real time and captures the complete connection relations among samples across all episodes. By planning along the directed edges of the DAG, we obtain another perspective for estimating state-action values, especially for pairs unknown to the deep neural network (DNN) and to episodic memory (EM). A mixed loss function combining the three learning systems (DNN, EM, and DAG) improves the efficiency of parameter updates in the proposed algorithm. Experiments show that our algorithm significantly outperforms state-of-the-art algorithms in both performance and sample efficiency on the test environments. Furthermore, the convergence of the algorithm is proved in the appendix, and its long-term performance as well as the effect of the DAG are verified.
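To make the mechanism described in the abstract concrete, the following minimal Python sketch illustrates one way such a system could be organized. It is an illustration based only on the abstract: the state node keys, the greedy max-backup planning rule, the number of sweeps, and the loss weights lam_em and lam_dag are all assumptions, not the paper's actual design.

```python
# Minimal sketch (assumptions flagged inline): a directed graph over all
# observed transitions, value backups along its edges, and a mixed loss
# combining DNN, episodic-memory (EM), and DAG targets.
from collections import defaultdict

import torch
import torch.nn.functional as F


class DirectedAssociativeGraph:
    """Directed graph over every transition seen in any episode."""

    def __init__(self, gamma=0.99):
        self.gamma = gamma
        self.edges = defaultdict(list)   # state -> [(action, reward, next_state)]
        self.value = defaultdict(float)  # graph-based state-value estimates

    def add_transition(self, s, a, r, s_next):
        # Built in real time: each environment step adds one directed edge,
        # so states shared between episodes connect their trajectories.
        self.edges[s].append((a, r, s_next))

    def plan(self, sweeps=10):
        # Assumed planning rule: repeated greedy backups along directed
        # edges, propagating returns backward through the whole graph.
        for _ in range(sweeps):
            for s, outgoing in self.edges.items():
                self.value[s] = max(r + self.gamma * self.value[s2]
                                    for _, r, s2 in outgoing)

    def q_estimate(self, s, a):
        # Graph-based estimate of a state-action pair, available even when
        # the pair is unknown to the DNN, provided it was ever visited.
        returns = [r + self.gamma * self.value[s2]
                   for a2, r, s2 in self.edges.get(s, []) if a2 == a]
        return max(returns) if returns else None


def mixed_loss(q_pred, td_target, em_target, dag_target,
               lam_em=0.1, lam_dag=0.1):
    """Hypothetical mixed loss over the three systems (DNN, EM, DAG)."""
    loss = F.mse_loss(q_pred, td_target)          # ordinary TD regression
    if em_target is not None:                     # episodic-memory term
        loss = loss + lam_em * F.mse_loss(q_pred, em_target)
    if dag_target is not None:                    # DAG planning term
        loss = loss + lam_dag * F.mse_loss(q_pred, dag_target)
    return loss
```

In this sketch, a state-action pair that the DNN has never fit well can still receive a graph-based estimate as long as it appears somewhere in the stored trajectories, which mirrors the abstract's claim that the DAG offers another perspective for pairs unknown to the DNN and EM.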
