Batch Reinforcement Learning with State Importance

Abstract

We investigate the use of function approximation in reinforcement learning where the agent's policy is represented as a classifier mapping states to actions. High classification accuracy is usually assumed to correlate with high policy quality. But this is not necessarily the case: increasing classification accuracy can actually decrease the policy's quality. This phenomenon occurs when the learning process begins to focus on classifying less "important" states. In this paper, we introduce a measure of a state's decision-making importance that can be used to improve policy learning. The resulting focused learning process is shown to converge faster to better policies.
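The idea above can be sketched in a few lines. The code below is my own illustration, not the paper's exact method: it assumes a small tabular Q-function, defines a state's importance as the gap between its best and worst action values (a state where all actions score similarly is cheap to misclassify), and weights the policy classifier's misclassification loss by that importance.

```python
# Sketch of importance-weighted policy classification.
# All names and Q-values here are hypothetical, for illustration only.

def state_importance(q_values):
    """Importance of a state: gap between best and worst action value.
    A large gap means misclassifying this state is costly."""
    return max(q_values) - min(q_values)

# Hypothetical Q-values for three states, two actions each.
Q = {
    "s0": [10, 9],  # nearly indifferent between actions -> low importance
    "s1": [5, 0],   # picking the wrong action is costly -> high importance
    "s2": [2, 1],
}

weights = {s: state_importance(q) for s, q in Q.items()}
greedy  = {s: q.index(max(q)) for s, q in Q.items()}  # target labels

def weighted_error(policy):
    """Importance-weighted misclassification loss: an error on an
    important state costs more than one on a near-indifferent state."""
    return sum(w for s, w in weights.items() if policy[s] != greedy[s])

# A policy that errs only on the unimportant state s0 ...
p_bad_on_s0 = {"s0": 1, "s1": 0, "s2": 0}
# ... versus one that errs only on the important state s1.
p_bad_on_s1 = {"s0": 0, "s1": 1, "s2": 0}

print(weighted_error(p_bad_on_s0))  # -> 1 (cheap mistake)
print(weighted_error(p_bad_on_s1))  # -> 5 (expensive mistake)
```

Plain classification accuracy would score both policies identically (one misclassified state each); the importance weighting is what lets the learner prefer the first.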
