On amount and quality of bias in reinforcement learning

Abstract

Reinforcement learning is widely regarded as elegant in theory but hopelessly slow in practice. This is because it is often studied under the assumption that there is little or no prior information about the task at hand. This assumption, however, is not the defining characteristic of learning. Learning involves the incorporation of prior knowledge or bias that can greatly accelerate or otherwise improve the learning process. In this paper we address the influence of the amount and quality of bias on the speed of reinforcement learning. For a chosen class of learning problems, different forms of bias are first identified. Some of the biases are extracted from knowledge of the environment, others from the task, and a few from both. Belief matrices, which reset Q-tables before learning commences, encode the biases. The average number of interactions between the agent and the environment is used to quantify the biases. Based on this performance measure, the biases are graded and some new results are reported. In addition, the paper compares continual learning to learning from scratch and presents results that clearly demonstrate the advantages of the former.
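The idea of encoding a bias as a belief matrix that sets the Q-table before learning starts, and of grading it by the average number of agent-environment interactions, can be illustrated with a small sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the ten-cell corridor task, the constants, and the names `belief_matrix` and `average_interactions` are hypothetical, and the particular bias used (initial values that grow toward the goal) is only one possible choice.

```python
import random

N_STATES = 10          # hypothetical corridor: cells 0..9, goal is the last cell
LEFT, RIGHT = 0, 1
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.1   # assumed learning constants


def step(state, action):
    """Move one cell; reward 1.0 only on reaching the goal."""
    nxt = min(state + 1, N_STATES - 1) if action == RIGHT else max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def belief_matrix():
    """Hypothetical bias: initial values for RIGHT that grow toward the goal."""
    belief = [[0.0, 0.0] for _ in range(N_STATES)]
    for s in range(N_STATES - 1):
        belief[s][RIGHT] = GAMMA ** (N_STATES - 2 - s)
    return belief


def choose(q_row, rng):
    """Epsilon-greedy action selection with random tie-breaking."""
    if rng.random() < EPSILON:
        return rng.choice((LEFT, RIGHT))
    best = max(q_row)
    return rng.choice([a for a in (LEFT, RIGHT) if q_row[a] == best])


def average_interactions(q_init, episodes=20, seed=0, max_steps=10_000):
    """Run tabular Q-learning; return the mean number of steps per episode."""
    rng = random.Random(seed)
    q = [row[:] for row in q_init]   # copy so the caller's matrix is untouched
    total = 0
    for _ in range(episodes):
        state, done, steps = 0, False, 0
        while not done and steps < max_steps:
            action = choose(q[state], rng)
            nxt, reward, done = step(state, action)
            target = reward + (0.0 if done else GAMMA * max(q[nxt]))
            q[state][action] += ALPHA * (target - q[state][action])
            state, steps = nxt, steps + 1
        total += steps
    return total / episodes


# Compare an uninformed Q-table with one preset from the belief matrix,
# averaging over several seeds to smooth out exploration noise.
zeros = [[0.0, 0.0] for _ in range(N_STATES)]
seeds = range(20)
scratch = sum(average_interactions(zeros, seed=s) for s in seeds) / len(seeds)
biased = sum(average_interactions(belief_matrix(), seed=s) for s in seeds) / len(seeds)
print(f"from scratch:       {scratch:.1f} interactions per episode")
print(f"with belief matrix: {biased:.1f} interactions per episode")
```

In this toy setting the preset table needs fewer interactions per episode than the all-zero table, because the greedy policy already points toward the goal before any reward has been observed. A weaker or misleading belief matrix could be substituted for `belief_matrix()` to explore how the quality of a bias changes the same measure.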

Bibliographic details

Source
IEEE International Conference on System, Man, and Cybernetics | 1999 | 6 pages
Authors
G. Hailu; G. Sommer
Original format: PDF
Chinese Library Classification: TP27-53
Keywords
Reinforcement learning; Bias; Continual learning

Similar literature

1. Multi-objective safe reinforcement learning: the relationship between multi-objective reinforcement learning and safe reinforcement learning [J]. Naoto Horie, Tohgoroh Matsui, Koichi Moriyama, Artificial Life and Robotics. 2019, No. 3
2. Overcoming model bias for robust offline deep reinforcement learning [J]. Phillip Swazinna, Steffen Udluft, Thomas Runkler. Engineering Applications of Artificial Intelligence. 2021, No. Sepa
3. Intermittent Absence of Control during Reinforcement Learning Interferes with Pavlovian Bias in Action Selection [J]. Journal of Cognitive Neuroscience. 2020, No. 4
4. On amount and quality of bias in reinforcement learning [C]. Hailu, G., Sommer, G. 1999
5. Model-Based Reinforcement Learning for Cooperative Multi-Agent Planning: Exploiting Hierarchies, Bias, and Temporal Sampling [D]. Ma, Aaron. 2020
6. Modification of a response bias through differential amount of reinforcement [O]. Charles Galloway. 1967
7. On amount and quality of bias in reinforcement learning [O]. G. Hailu, G. Sommer. 1999
8. Learning State Features from Policies to Bias Exploration in Reinforcement Learning [R]. Singer, B., Veloso, M. 1999
