Journal: 電子情報通信学会技術研究報告. 情報論的学習理論と機械学習 (IEICE Technical Report)

New Feature Selection Method for Reinforcement Learning: Conditional Mutual Information Reveals Implicit State-Reward Dependency

Abstract

Model-free reinforcement learning (RL) is a machine learning approach to decision making in unknown environments. However, real-world RL tasks often involve high-dimensional state spaces, in which standard RL methods perform poorly. In this paper, we propose a new feature selection framework for coping with such high dimensionality. The proposed framework adopts the conditional mutual information between state and return sequences as a feature selection criterion, allowing the evaluation of implicit state-reward dependency. The conditional mutual information is approximated by a least-squares method, which yields a computationally efficient feature selection procedure. The usefulness of the proposed method is demonstrated in simulated mobile-robot navigation experiments.
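
For reference, the criterion described in the abstract can be written with the standard definition of conditional mutual information. For a candidate state feature s, the return R, and the remaining (conditioning) features z,

\[ I(s; R \mid z) = \mathbb{E}_{p(s,R,z)}\!\left[ \log \frac{p(s, R \mid z)}{p(s \mid z)\, p(R \mid z)} \right], \]

which is zero exactly when s and R are conditionally independent given z; a large value indicates that s carries reward-relevant information not already captured by z. This is what allows the criterion to reveal implicit state-reward dependency.

Below is a minimal sketch of how such a score could drive greedy forward feature selection. It uses a naive histogram plug-in estimator on discretized data purely for illustration; the paper instead approximates the conditional mutual information by a least-squares method, which is not reproduced here, and the function names and toy setup are assumptions of this sketch, not the authors' implementation.

import numpy as np

def cmi_discrete(x, y, z):
    # Plug-in estimate of I(X; Y | Z) for integer-coded 1-D arrays.
    # A naive histogram estimator, used only for illustration; the
    # paper approximates the CMI with a least-squares method instead.
    total = 0.0
    for zv in np.unique(z):
        m = z == zv
        pz = m.mean()
        xs, ys = x[m], y[m]
        for xv in np.unique(xs):
            for yv in np.unique(ys):
                pxy = np.mean((xs == xv) & (ys == yv))
                if pxy > 0:
                    px = np.mean(xs == xv)
                    py = np.mean(ys == yv)
                    total += pz * pxy * np.log(pxy / (px * py))
    return total

def encode_rows(a):
    # Collapse each row of a 2-D integer array into a single code, so
    # several selected features act as one conditioning variable.
    if a.shape[1] == 0:
        return np.zeros(len(a), dtype=int)
    return np.unique(a, axis=0, return_inverse=True)[1]

def forward_select(S, R, k):
    # Greedily add the feature j with the largest estimated
    # I(s_j; R | features selected so far).
    selected = []
    for _ in range(k):
        z = encode_rows(S[:, selected])
        rest = [j for j in range(S.shape[1]) if j not in selected]
        scores = {j: cmi_discrete(S[:, j], R, z) for j in rest}
        selected.append(max(scores, key=scores.get))
    return selected

# Toy check: the return depends only on features 1 and 3.
rng = np.random.default_rng(0)
S = rng.integers(0, 3, size=(1000, 5))
R = 2 * S[:, 1] + S[:, 3]          # a real RL return would be discretized first
print(forward_select(S, R, k=2))   # expect features 1 and 3 to be chosen

In this toy run, the second pick relies on conditioning: once feature 1 is selected, feature 3 scores highly precisely because of its dependency with the return given the already-selected feature.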
