Batch Reinforcement Learning with State Importance

Abstract

We investigate the use of function approximation in reinforcement learning where the agent's policy is represented as a classifier mapping states to actions. High classification accuracy is usually assumed to correlate with high policy quality. But this is not necessarily the case: increasing classification accuracy can actually decrease the policy's quality. This phenomenon occurs when the learning process begins to focus on classifying less "important" states. In this paper, we introduce a measure of a state's decision-making importance that can be used to improve policy learning. The resulting focused learning process is shown to converge faster to better policies.
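The idea above can be sketched in a few lines. The code below is my own illustration, not the paper's exact method: it assumes a small tabular Q-function, defines a state's importance as the gap between its best and worst action values (a state where all actions score similarly is cheap to misclassify), and weights the policy classifier's misclassification loss by that importance.

```python
# Sketch of importance-weighted policy classification.
# All names and Q-values here are hypothetical, for illustration only.

def state_importance(q_values):
    """Importance of a state: gap between best and worst action value.
    A large gap means misclassifying this state is costly."""
    return max(q_values) - min(q_values)

# Hypothetical Q-values for three states, two actions each.
Q = {
    "s0": [10, 9],  # nearly indifferent between actions -> low importance
    "s1": [5, 0],   # picking the wrong action is costly -> high importance
    "s2": [2, 1],
}

weights = {s: state_importance(q) for s, q in Q.items()}
greedy  = {s: q.index(max(q)) for s, q in Q.items()}  # target labels

def weighted_error(policy):
    """Importance-weighted misclassification loss: an error on an
    important state costs more than one on a near-indifferent state."""
    return sum(w for s, w in weights.items() if policy[s] != greedy[s])

# A policy that errs only on the unimportant state s0 ...
p_bad_on_s0 = {"s0": 1, "s1": 0, "s2": 0}
# ... versus one that errs only on the important state s1.
p_bad_on_s1 = {"s0": 0, "s1": 1, "s2": 0}

print(weighted_error(p_bad_on_s0))  # -> 1 (cheap mistake)
print(weighted_error(p_bad_on_s1))  # -> 5 (expensive mistake)
```

Plain classification accuracy would score both policies identically (one misclassified state each); the importance weighting is what lets the learner prefer the first.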
