Acceleration of Reinforcement Learning by Controlled Use of Options Given as Prior Information

Kento TERASHIMA; Hirotaka TAKANO; Junichi MURATA

Abstract

Reinforcement learning is a method by which an agent learns an appropriate action policy for solving a problem through trial and error. Its advantage is that it can be applied to unknown or uncertain problems. Its drawback is that the trial and error makes it slow to solve the problem. If prior information about the environment is available, some of the trial and error can be spared and learning takes less time. A human designer can provide this prior information in the form of options. However, the options can be wrong because of uncertainties in the problem. Using wrong options can have adverse effects, such as failing to obtain the optimal policy and slowing down learning. This paper proposes controlling the use of the options to suppress these adverse effects: the agent gradually forgets the given options as it learns a better policy. The proposed method is applied to three testbed environments with two types of prior information, and it shows good results in terms of both learning speed and the quality of the obtained policies.
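
The abstract does not give the algorithm itself, so the following is only a minimal sketch of the general idea of gradually forgetting prior information. It assumes a toy 1-D corridor, tabular Q-learning, a prior reduced to one suggested action per state (a stand-in for options), and a follow probability that is multiplied by a forgetting factor after each episode. The names `train`, `prior`, `p0`, and `forget` are hypothetical and do not come from the paper.

```python
import random

def train(n_states=6, episodes=300, p0=0.9, forget=0.97,
          alpha=0.5, gamma=0.9, epsilon=0.1, seed=0, prior=None):
    """Tabular Q-learning on a 1-D corridor; the goal is the rightmost state."""
    rng = random.Random(seed)
    actions = (-1, +1)  # index 0 moves left, index 1 moves right
    q = [[0.0, 0.0] for _ in range(n_states)]
    # Assumption: the prior is one suggested action index per state (may be wrong).
    if prior is None:
        prior = [1] * n_states
    p_follow = p0  # probability of obeying the prior instead of the learned policy
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # step cap per episode
            if rng.random() < p_follow:
                a = prior[s]                      # follow the given prior
            elif rng.random() < epsilon:
                a = rng.randrange(2)              # random exploration
            else:
                best = max(q[s])                  # greedy, ties broken at random
                a = rng.choice([i for i in range(2) if q[s][i] == best])
            s2 = min(max(s + actions[a], 0), n_states - 1)
            done = s2 == n_states - 1
            r = 1.0 if done else 0.0
            target = r if done else r + gamma * max(q[s2])
            q[s][a] += alpha * (target - q[s][a])
            s = s2
            if done:
                break
        p_follow *= forget  # forget the prior a little after every episode
    greedy = [max(range(2), key=lambda i: q[s][i]) for s in range(n_states - 1)]
    return greedy, p_follow
```

Calling `train(prior=[0] * 6)` supplies a deliberately wrong prior (always move left). Because the follow probability decays, the learned values take over in later episodes, which is the kind of bad-effect suppression the abstract describes.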


Bibliographic details

Source: SICE Journal of Control, Measurement, and System Integration (SICE JCMSI), 2013, Issue 4, 7 pages
Authors: Kento TERASHIMA; Hirotaka TAKANO; Junichi MURATA
Original format: PDF
Language: English
CLC classification: Metrology
Keywords: prior information; forgetting factor; option; reinforcement learning; exploring visit


Similar literature

1. Acceleration of Reinforcement Learning by Controlled Use of Options Given as Prior Information [J]. Kento TERASHIMA, Hirotaka TAKANO, Junichi MURATA. SICE Journal of Control, Measurement, and System Integration (SICE JCMSI). 2013, Issue 4
2. Convergence of reinforcement learning algorithms and acceleration of learning - art. no. 026706 [J]. Potapov A., Ali MK. Physical review, E. Statistical physics, plasmas, fluids, and related interdisciplinary topics. 2003, Issue 2aPta2
3. Acceleration of game learning with prediction-based reinforcement learning - toward the emergence of planning behavior [J]. Yu Ohigashi, Takashi Omori, Koji Morikawa. 電子情報通信学会技術研究報告. ニュ-ロコンピュ-ティング. Neurocomputing. 2002, Issue 627
4. A study on use of prior information for acceleration of reinforcement learning [C]. Terashima Kento, Murata Junichi. SICE Annual Conference 2011: Final program and abstracts. 2011
5. Effects Of Dopamine Antagonists On Gambling Reinforcement And The Impact Of Prior Exposure In Pathological Gamblers And Controls [D]. Smart, Kelly. 2013
6. Optimizing the Sensor Placement for Foot Plantar Center of Pressure without Prior Knowledge Using Deep Reinforcement Learning [O]. Cheng-Wu Lin, Shanq-Jang Ruan, Wei-Chun Hsu. 2020
7. Federated Reinforcement Learning Acceleration Method for Precise Control of Multiple Devices [O]. Hyun-Kyo Lim, Ju-Bong Kim, Ihsan Ullah. 2021
