Using the knowledge of agents that already perform a task successfully can help avoid the expensive exploration that some domains of reinforcement learning require. Three methods of embedding a mentor's knowledge are proposed: initialization of the Q-function, reward shaping, and keeping the mentor's decisions in a separate action-value function. The convergence speed of these methods in combination with the Q-learning algorithm, given varying amounts of information about the mentor's decisions, and their robustness to the mentor's quality are compared on four domains from the "Reinforcement Learning Benchmarks and Bake-offs" suite for testing and comparing reinforcement learning algorithms.
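The three embedding schemes can be sketched in a few lines of tabular Q-learning. The chain MDP, the mentor's value estimate V_m, the tie-breaking rule, and all hyperparameters below are illustrative assumptions rather than the paper's actual setup; the sketch only shows where each method hooks into the update loop.

```python
import random

# Toy chain MDP standing in for the benchmark domains: states 0..N-1,
# actions 0 (left) / 1 (right), reward 1.0 on reaching state N-1.
N, ACTIONS = 10, (0, 1)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

def step(s, a):
    s2 = min(max(s + (1 if a == 1 else -1), 0), N - 1)
    return s2, (1.0 if s2 == N - 1 else 0.0), s2 == N - 1

# Hypothetical mentor knowledge: a value estimate V_m and a greedy policy.
V_m = {s: GAMMA ** (N - 1 - s) for s in range(N)}
def mentor_action(s):
    return 1

def train(mode, episodes=200):
    # (1) Q-function initialization: seed Q with the mentor's estimates.
    init = (lambda s: V_m[s]) if mode == "init" else (lambda s: 0.0)
    Q = {(s, a): init(s) for s in range(N) for a in ACTIONS}
    # (3) Mentor's decisions kept in a separate action-value table,
    # used here only to break ties in favour of the mentor's choice.
    Qm = {(s, a): float(a == mentor_action(s))
          for s in range(N) for a in ACTIONS}
    use_qm = (mode == "separate")
    for _ in range(episodes):
        s = 0
        for _ in range(500):            # cap episode length
            if random.random() < EPS:
                a = random.choice(ACTIONS)
            else:
                key = lambda b: (Q[s, b], Qm[s, b] if use_qm else 0.0)
                best = max(key(b) for b in ACTIONS)
                a = random.choice([b for b in ACTIONS if key(b) == best])
            s2, r, done = step(s, a)
            # (2) Potential-based reward shaping with the mentor's values
            # as the potential (zero potential at terminal states).
            if mode == "shaping":
                r += GAMMA * (0.0 if done else V_m[s2]) - V_m[s]
            target = r if done else r + GAMMA * max(Q[s2, b] for b in ACTIONS)
            Q[s, a] += ALPHA * (target - Q[s, a])
            s = s2
            if done:
                break
    return Q

for mode in ("init", "shaping", "separate"):
    print(mode, round(train(mode)[0, 1], 3))
```

Note that the shaping term follows the standard potential-based form, which leaves the optimal policy unchanged regardless of the mentor's quality, whereas the other two methods bias either the starting point or the action selection directly.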