
Reinforcement Learning with Time

Abstract

This paper steps back from the standard infinite horizon formulation of reinforcement learning problems to consider the simpler case of finite horizon problems. Although finite horizon problems may be solved using infinite horizon learning algorithms by recasting the problem as an infinite horizon problem over a state space extended to include time, we show that such an application of infinite horizon learning algorithms does not make use of what is known about the environment structure, and is therefore inefficient. Preserving a notion of time within the environment allows us to consider extending the environment model to include, for example, random action duration. Such extensions allow us to model non-Markov environments which can be learned using reinforcement learning algorithms.
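The contrast the abstract draws can be illustrated with a minimal sketch. The toy chain MDP, horizon, and dynamics below are hypothetical, not from the paper: a value function indexed explicitly by (time, state) is filled in by a single backward sweep, exploiting the known finite-horizon structure instead of iterating to convergence over a (state, time) product space as an infinite-horizon method would.

```python
# Hypothetical toy finite-horizon MDP: a chain of states with horizon H.
# The idea from the abstract: keep time explicit, so the value function is
# indexed by (t, state) rather than folding time into an extended state.
H = 4               # horizon (number of decision stages)
N_STATES = 5        # chain states 0..4; reward for reaching the rightmost
ACTIONS = (-1, +1)  # step left or right along the chain

def step(s, a):
    """Deterministic chain dynamics; reward 1 on entering the last state."""
    s2 = max(0, min(N_STATES - 1, s + a))
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r

# Backward induction: V[t][s] is computed in one pass from t = H-1 down to 0,
# using V[H][s] = 0 as the terminal value. Each entry is touched exactly once,
# which is the efficiency gain from preserving the time structure.
V = [[0.0] * N_STATES for _ in range(H + 1)]
for t in range(H - 1, -1, -1):
    for s in range(N_STATES):
        V[t][s] = max(r + V[t + 1][s2]
                      for a in ACTIONS
                      for (s2, r) in [step(s, a)])

print(V[0][0])  # value of starting at state 0 with H steps remaining
```

With H = 4 the agent can just reach state 4 from state 0, so the start value is 1.0; shrinking H below 4 drops it to 0, something an infinite-horizon formulation can only express by duplicating every state once per time step.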
