International Conference on Algorithmic Learning Theory

Asymptotic Learnability of Reinforcement Problems with Arbitrary Dependence



Abstract

We address the problem of reinforcement learning in which observations may exhibit an arbitrary form of stochastic dependence on past observations and actions, i.e. environments more general than (PO)MDPs. The task for an agent is to attain the best possible asymptotic reward when the true generating environment is unknown but belongs to a known countable family of environments. We find sufficient conditions on the class of environments under which there exists an agent that attains the best asymptotic reward for any environment in the class. We analyze how tight these conditions are and how they relate to different probabilistic assumptions known in reinforcement learning and related fields, such as Markov Decision Processes and mixing conditions.
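The setting the abstract describes — an agent facing an environment that is unknown but known to lie in a countable class, aiming for the best asymptotic average reward — can be illustrated with a toy sketch. This is not the paper's construction: the Bernoulli reward environments, the finite stand-in for a countable class, and the greedy "act by the most probable candidate" rule below are all illustrative assumptions.

```python
import random

class BernoulliEnv:
    """Two-action toy environment: action a yields reward 1 with probability p[a]."""
    def __init__(self, p):
        self.p = p

    def step(self, action, rng):
        return 1 if rng.random() < self.p[action] else 0

def run_agent(true_env, candidates, steps=2000, seed=0):
    """Agent with a uniform prior over a (here finite) candidate class.

    It acts greedily according to the currently most probable candidate and
    updates a Bayesian posterior from the observed rewards.
    """
    rng = random.Random(seed)
    weights = [1.0 / len(candidates)] * len(candidates)  # uniform prior
    total_reward = 0
    for _ in range(steps):
        # Pick the highest-posterior candidate, then its best action.
        best = max(range(len(candidates)), key=lambda i: weights[i])
        action = max(range(2), key=lambda a: candidates[best].p[a])
        r = true_env.step(action, rng)
        total_reward += r
        # Posterior update: likelihood of the observed reward under each candidate.
        for i, env in enumerate(candidates):
            lik = env.p[action] if r == 1 else 1 - env.p[action]
            weights[i] *= lik
        z = sum(weights)
        weights = [w / z for w in weights]
    return total_reward / steps

# The true environment is the first candidate; the posterior concentrates on it,
# so the average reward approaches its best arm's payoff (0.8 here).
candidates = [BernoulliEnv([0.2, 0.8]), BernoulliEnv([0.7, 0.1])]
avg = run_agent(candidates[0], candidates)
```

In this benign case the posterior identifies the true environment quickly; the paper's point is that for general (non-i.i.d., non-Markov) environments such convergence requires extra conditions on the class, which is what the sufficient conditions in the abstract address.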
