Reinforcement learning with supervision by combining multiple learnings and expert advices



Abstract

In this paper, we provide a formal, coherent learning framework in which reinforcement learning is combined with multiple learnings and expert advice to accelerate the convergence of learning. Our approach is simply to use a nonstationary "potential-based reinforcement function" to shape the reinforcement signal given to the learning "base-agent". The base-agent employs SARSA(0) or adaptive asynchronous value iteration (VI). The supervised inputs to the base-agent, which come from "subagents" engaged in other parallel, independent reinforcement learnings and, if available, from experts, are "merged" into the value of the potential-based reinforcement function. That value is then put into the update equation of SARSA(0) for the Q-function estimate, or of adaptive asynchronous VI for the optimal value function estimate. The resulting SARSA(0) and adaptive asynchronous VI each converge to an optimal policy.
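The shaping idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the shaping term uses the standard potential-based form F(s, s') = gamma * Phi(s') - Phi(s), the "merge" of subagent and expert inputs is assumed to be a plain average, and the names `merged_potential`, `shaping_term`, and `sarsa0_update` are hypothetical. Terminal-state handling is omitted.

```python
from collections import defaultdict

GAMMA = 0.95  # discount factor (illustrative value, not from the paper)


def merged_potential(state, advisors):
    """Potential Phi(state): average of the advisors' value estimates.

    `advisors` is a list of callables state -> float standing in for
    subagents and experts. Averaging is an assumed merge rule.
    """
    if not advisors:
        return 0.0
    return sum(advise(state) for advise in advisors) / len(advisors)


def shaping_term(s, s_next, advisors, gamma=GAMMA):
    """Potential-based shaping term F(s, s') = gamma * Phi(s') - Phi(s)."""
    return gamma * merged_potential(s_next, advisors) - merged_potential(s, advisors)


def sarsa0_update(Q, s, a, r, s_next, a_next, advisors, alpha=0.1, gamma=GAMMA):
    """One SARSA(0) step in which the shaping term is added to the reward."""
    shaped_r = r + shaping_term(s, s_next, advisors, gamma)
    td_error = shaped_r + gamma * Q[(s_next, a_next)] - Q[(s, a)]
    Q[(s, a)] += alpha * td_error
    return Q[(s, a)]


# Usage: two hypothetical advisors that score integer states.
Q = defaultdict(float)
advisors = [lambda s: float(s), lambda s: 2.0 * s]
new_q = sarsa0_update(Q, s=1, a=0, r=1.0, s_next=2, a_next=0,
                      advisors=advisors, alpha=0.5, gamma=0.5)
```

Because the advisors are ordinary callables, their estimates may change between calls, which is one simple way to model a nonstationary potential in this sketch.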


Bibliographic details

Source: American Control Conference, 2006, 6 pages
Author: Hyeong Soo Chang
Original format: PDF
Chinese Library Classification: TP273-53
Keywords: learning (artificial intelligence); software agents; Q-function estimate; adaptive asynchronous value iteration; expert advices; learning base-agent; multiple learnings; optimal value function estimate; parallel independent reinforcement learning; potential-based;


