Toward Interpretable Deep Reinforcement Learning with Linear Model U-Trees



Abstract

Deep Reinforcement Learning (DRL) has achieved impressive success in many applications. A key component of many DRL models is a neural network representing a Q function, to estimate the expected cumulative reward following a state-action pair. The Q function neural network contains a lot of implicit knowledge about the RL problems, but often remains unexamined and uninterpreted. To our knowledge, this work develops the first mimic learning framework for Q functions in DRL. We introduce Linear Model U-trees (LMUTs) to approximate neural network predictions. An LMUT is learned using a novel on-line algorithm that is well-suited for an active play setting, where the mimic learner observes an ongoing interaction between the neural net and the environment. Empirical evaluation shows that an LMUT mimics a Q function substantially better than five baseline methods. The transparent tree structure of an LMUT facilitates understanding the network's learned strategic knowledge by analyzing feature influence, extracting rules, and highlighting the super-pixels in image inputs. Code related to this paper is available at: https://github.com/Guiliang/uTree-mimic_mountain_car
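The mimic-learning idea in the abstract can be illustrated with a small sketch. It is not the authors' LMUT algorithm: it fits a single split in batch mode rather than growing a tree on-line, and the teacher function teacher_q is a hypothetical stand-in for a trained Q network. All function names and the synthetic data are assumptions made for illustration. The sketch fits a one-split tree with a linear model in each leaf to the teacher's outputs and compares it with a single linear model.

```python
import numpy as np


def fit_linear(X, y):
    """Least-squares fit of y ~ X @ w + b; the bias is the last weight."""
    A = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w


def predict_linear(w, X):
    return np.hstack([X, np.ones((len(X), 1))]) @ w


def fit_stump(X, y, min_leaf=10):
    """One-split tree with a linear model per leaf (batch simplification)."""
    best = None
    for j in range(X.shape[1]):
        # Candidate thresholds: quartiles of feature j.
        for t in np.quantile(X[:, j], [0.25, 0.5, 0.75]):
            left = X[:, j] <= t
            if left.sum() < min_leaf or (~left).sum() < min_leaf:
                continue
            wl = fit_linear(X[left], y[left])
            wr = fit_linear(X[~left], y[~left])
            err = (np.sum((predict_linear(wl, X[left]) - y[left]) ** 2)
                   + np.sum((predict_linear(wr, X[~left]) - y[~left]) ** 2))
            if best is None or err < best[0]:
                best = (err, j, t, wl, wr)
    return best[1:]


def predict_stump(stump, X):
    j, t, wl, wr = stump
    left = X[:, j] <= t
    out = np.empty(len(X))
    out[left] = predict_linear(wl, X[left])
    out[~left] = predict_linear(wr, X[~left])
    return out


def teacher_q(X):
    """Hypothetical stand-in for a trained Q network (piecewise linear)."""
    return np.where(X[:, 0] <= 0.0,
                    2.0 * X[:, 1] - 1.0,
                    -X[:, 0] + 0.5 * X[:, 1])


rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(400, 2))  # synthetic state features
y = teacher_q(X)                            # teacher outputs to mimic

stump = fit_stump(X, y)
single = fit_linear(X, y)
mse_tree = np.mean((predict_stump(stump, X) - y) ** 2)
mse_linear = np.mean((predict_linear(single, X) - y) ** 2)
print(f"split on feature {stump[0]} at {stump[1]:.3f}")
print(f"MSE tree: {mse_tree:.4f}  MSE single linear model: {mse_linear:.4f}")
```

The leaf weights give a direct reading of feature influence within each region of the state space, which is the kind of transparency the abstract attributes to the tree structure.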


Bibliographic details

Source
European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2018, pp. 414-429 (16 pages)
Authors
Guiliang Liu; Oliver Schulte; Wang Zhu; Qingcan Li
Original format
PDF

Similar literature

1. A model-based deep reinforcement learning method applied to finite-horizon optimal control of nonlinear control-affine system [J]. Journal of Process Control. 2020
2. A whole-process interpretable and multi-modal deep reinforcement learning for diagnosis and analysis of Alzheimer’s disease [J]. Quan Zhang, Qian Du, Guohua Liu. Journal of Neural Engineering. 2021, No. 6
3. Predicting Advertisement Clicks Using Deep Networks: Interpreting Deep Learning Models [J]. Samel Karan. Journal of Purdue Undergraduate Research. 2017, No. 1
4. Toward Interpretable Deep Reinforcement Learning with Linear Model U-Trees [C]. Guiliang Liu, Oliver Schulte, Wang Zhu. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. 2019
5. Robotic Swarm Control Using Deep Reinforcement Learning Strategies Based on Mean-Field Models [D]. Kakish, Zahi. 2021
6. Estimating and interpreting nonlinear receptive field of sensory neural responses with deep neural network models [O]. Menoua Keshishian, Hassan Akbari, Bahar Khalighinejad. 2020
7. Toward Interpretable Deep Reinforcement Learning with Linear Model U-Trees [O]. Guiliang Liu, Oliver Schulte, Wang Zhu. 2019
