...
首页> 外文期刊>Journal of scientific computing >A Sojourn-Based Approach to Semi-Markov Reinforcement Learning
【24h】

A Sojourn-Based Approach to Semi-Markov Reinforcement Learning

机译:A Sojourn-Based Approach to Semi-Markov Reinforcement Learning

获取原文
获取原文并翻译 | 示例
   

获取外文期刊封面封底 >>

       

摘要

Abstract In this paper we introduce a new approach to discrete-time semi-Markov decision processes based on the sojourn time process. Different characterizations of discrete-time semi-Markov processes are exploited and decision processes are constructed by their means. With this new approach, the agent is allowed to consider different actions depending also on the sojourn time of the process in the current state. A numerical method based on Q-learning algorithms for finite horizon reinforcement learning and stochastic recursive relations is investigated. Finally, we consider two toy examples: one in which the reward depends on the sojourn-time, according to the gambler’s fallacy; the other in which the environment is semi-Markov even if the reward function does not depend on the sojourn time. These are used to carry on some numerical evaluations on the previously presented Q-learning algorithm and on a different naive method based on deep reinforcement learning.

著录项

获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号