
Reinforcement Learning with Time

Abstract

This paper steps back from the standard infinite horizon formulation of reinforcement learning problems to consider the simpler case of finite horizon problems. Although finite horizon problems may be solved using infinite horizon learning algorithms by recasting the problem as an infinite horizon problem over a state space extended to include time, we show that such an application of infinite horizon learning algorithms does not make use of what is known about the environment structure, and is therefore inefficient. Preserving a notion of time within the environment allows us to consider extending the environment model to include, for example, random action duration. Such extensions allow us to model non-Markov environments which can be learned using reinforcement learning algorithms.
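The contrast the abstract draws can be illustrated with a minimal sketch. The toy chain MDP, horizon, and dynamics below are hypothetical, not from the paper: a value function indexed explicitly by (time, state) is filled in by a single backward sweep, exploiting the known finite-horizon structure instead of iterating to convergence over a (state, time) product space as an infinite-horizon method would.

```python
# Hypothetical toy finite-horizon MDP: a chain of states with horizon H.
# The idea from the abstract: keep time explicit, so the value function is
# indexed by (t, state) rather than folding time into an extended state.
H = 4               # horizon (number of decision stages)
N_STATES = 5        # chain states 0..4; reward for reaching the rightmost
ACTIONS = (-1, +1)  # step left or right along the chain

def step(s, a):
    """Deterministic chain dynamics; reward 1 on entering the last state."""
    s2 = max(0, min(N_STATES - 1, s + a))
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r

# Backward induction: V[t][s] is computed in one pass from t = H-1 down to 0,
# using V[H][s] = 0 as the terminal value. Each entry is touched exactly once,
# which is the efficiency gain from preserving the time structure.
V = [[0.0] * N_STATES for _ in range(H + 1)]
for t in range(H - 1, -1, -1):
    for s in range(N_STATES):
        V[t][s] = max(r + V[t + 1][s2]
                      for a in ACTIONS
                      for (s2, r) in [step(s, a)])

print(V[0][0])  # value of starting at state 0 with H steps remaining
```

With H = 4 the agent can just reach state 4 from state 0, so the start value is 1.0; shrinking H below 4 drops it to 0, something an infinite-horizon formulation can only express by duplicating every state once per time step.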
