Improving Reinforcement Learning Rates Using Prior Knowledge

Abstract

This paper presents a new technique called Prior Policy Fallback (PPF), which provides a simple means of incorporating prior knowledge into a reinforcement learning system based on Q-learning. PPF performs significantly better than an intuitively similar method based on selectively initializing values of the Q table. These benefits arise because PPF does not interfere with the normal Q-learning update mechanisms. PPF simply shortens the time to reward during initial trials and then gradually becomes less involved as normal Q-learning updates occur. PPF provides an alternative to manual teaching methods for accelerating learning during early trials and can be used in conjunction with many of the existing methods for accelerating Q-learning updates. The benefits of PPF are illustrated through several experiments based on a static grid-based world.
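
The abstract does not spell out the fallback rule, so the following is only a minimal sketch of one possible reading, not the paper's algorithm. It assumes that the agent defers to the prior policy in any state whose Q-values are all still at their initial value, and uses ordinary Q-learning otherwise. The grid size, reward, learning parameters, and the names `prior_policy`, `step`, and `run` are all hypothetical.

```python
import random

# Actions as (row delta, column delta): up, down, left, right.
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]
UP, DOWN, LEFT, RIGHT = range(4)


def prior_policy(state, goal):
    """Hypothetical hand-written prior: go right to the goal column, then down."""
    return RIGHT if state[1] < goal[1] else DOWN


def step(state, action, size):
    """Move on a static grid; moves off the edge leave the agent in place."""
    dr, dc = ACTIONS[action]
    row = min(max(state[0] + dr, 0), size - 1)
    col = min(max(state[1] + dc, 0), size - 1)
    return (row, col)


def run(episodes=10, size=5, alpha=0.5, gamma=0.9, epsilon=0.0,
        seed=0, max_steps=1000):
    """Return a list of (steps, fallback_count) pairs, one per episode."""
    rng = random.Random(seed)
    goal = (size - 1, size - 1)
    q = {(r, c): [0.0] * 4 for r in range(size) for c in range(size)}
    history = []
    for _ in range(episodes):
        state, steps, fallbacks = (0, 0), 0, 0
        while state != goal and steps < max_steps:
            values = q[state]
            if max(values) == min(values):
                # Assumed fallback rule: no learned preference in this
                # state yet, so defer to the prior policy.
                action = prior_policy(state, goal)
                fallbacks += 1
            elif rng.random() < epsilon:
                action = rng.randrange(4)  # occasional exploration
            else:
                action = values.index(max(values))  # greedy on learned values
            nxt = step(state, action, size)
            reward = 1.0 if nxt == goal else 0.0
            # The standard Q-learning update is left untouched by the fallback.
            target = reward + gamma * max(q[nxt])
            values[action] += alpha * (target - values[action])
            state = nxt
            steps += 1
        history.append((steps, fallbacks))
    return history
```

Calling `run()` returns per-episode step and fallback counts. Under these assumptions the prior policy takes the agent to the reward on the very first trial, and the number of fallback decisions shrinks as learned Q-values spread back from the goal, which mirrors the "gradually becomes less involved" behaviour the abstract describes.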

Bibliographic details

Author: Swanson, Keith J.
Year: 1993
Pages: 1-18
Total pages: 18
Original format: PDF
Language: English
CLC classification: Industrial technology
Keywords: MACHINE LEARNING; LEARNING THEORY; TIME LAG; SELF ORGANIZING SYSTEMS; RATES (PER TIME); HEURISTIC METHODS
Added to database: 2022-08-29 10:58:36
