Smart Grid Conference

A deep structure for option discovery in reinforcement learning



Abstract

Hierarchical learning is another way to scale up reinforcement learning and enable its application to very hard learning problems. It is a divide-and-conquer technique: a complex learning problem is decomposed into small pieces so that each can be solved easily. The option framework is one way of using hierarchical learning in reinforcement learning. In this paper we use a free-energy-based function approximator (FE-RBM) to determine the option initiation set. Our proposed method computes an output for each input (consisting of a state and a subgoal) as the negative free energy of an RBM. Learning is performed by stochastic gradient descent on a mean-squared-error objective. The experimental results show that this method creates options effectively and generalizes reasonably well to unvisited states.
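The scoring rule the abstract describes — the approximator's output for a (state, subgoal) input vector is the negative free energy of an RBM, fitted by stochastic gradient descent on a mean-squared error — can be sketched as follows. This is a minimal illustration under stated assumptions (binary hidden units, a flat binary encoding of state and subgoal, made-up shapes and learning rate), not the authors' implementation:

```python
import numpy as np

def neg_free_energy(v, W, b, c):
    """Negative free energy of an RBM with binary hidden units:
    -F(v) = b.v + sum_j log(1 + exp(c_j + v.W_j)).
    v is the concatenated binary encoding of (state, subgoal)."""
    pre = c + v @ W                          # hidden pre-activations
    return float(v @ b + np.sum(np.log1p(np.exp(pre))))

def sgd_step(v, y, W, b, c, lr=0.05):
    """One stochastic-gradient step on the squared error
    (neg_free_energy(v) - y)^2; updates W, b, c in place
    and returns the pre-update loss."""
    pre = c + v @ W
    h = 1.0 / (1.0 + np.exp(-pre))           # sigmoid(pre) = d(-F)/dc
    y_hat = float(v @ b + np.sum(np.log1p(np.exp(pre))))
    g = 2.0 * (y_hat - y)                    # d loss / d y_hat
    b -= lr * g * v                          # d(-F)/db = v
    c -= lr * g * h                          # d(-F)/dc = sigmoid(pre)
    W -= lr * g * np.outer(v, h)             # d(-F)/dW = v h^T
    return (y_hat - y) ** 2
```

With a target of, say, 1 for inputs whose state belongs to the initiation set and 0 otherwise, repeated `sgd_step` calls drive the RBM's negative free energy toward those targets; because the RBM shares weights across all inputs, states never seen during training still receive a smooth score, which is the generalization property the abstract notes.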
