
Reinforcement-learning agents with different temperature parameters explain the variety of human action-selection behavior in a Markov decision process task



Abstract

We investigated the characteristics of the human action-selection in performing a Markov decision process (MDP) task, and compared them to those of reinforcement-learning (RL) agents. The behavior of human participants was roughly classified into two qualitatively different types. On the other hand, surprisingly, the variety of human behavior could be explained simply by a single parameter of the degree of randomness (i.e., the temperature parameter) in the action-selection rules of the RL agents. This result implies that the various behaviors of human action-selection may be determined by a simple mechanism in the brain.
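The abstract does not state which action-selection rule the RL agents used. As an illustration only, the sketch below assumes the common Boltzmann (softmax) rule, in which the probability of choosing an action is proportional to exp(Q / T) for action value Q and temperature T. The function names and the action values are hypothetical and are not taken from the paper.

```python
import math
import random


def softmax_probabilities(action_values, temperature):
    """Boltzmann (softmax) action probabilities for a list of action values.

    Assumed rule for illustration; the paper's exact rule is not given
    in the abstract.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Subtract the maximum before exponentiating for numerical stability.
    max_value = max(action_values)
    weights = [math.exp((q - max_value) / temperature) for q in action_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(action_values, temperature, rng=random):
    """Sample an action index according to the softmax probabilities."""
    probs = softmax_probabilities(action_values, temperature)
    return rng.choices(range(len(action_values)), weights=probs, k=1)[0]


# Hypothetical action values for a single state (not from the paper).
q_values = [1.0, 0.5, 0.0]
low_t = softmax_probabilities(q_values, temperature=0.1)    # near-greedy
high_t = softmax_probabilities(q_values, temperature=10.0)  # near-uniform
```

Under this assumed rule, a low temperature concentrates almost all probability on the highest-valued action, while a high temperature makes the choice close to uniformly random, so a single parameter spans a range from exploitative to exploratory behavior.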

Bibliographic record

  • Source
    Neurocomputing | 2009, Issue 9 | pp. 1979-1984 | 6 pages
  • Author affiliations

    Graduate School of Information Systems, University of Electro-Communications, 1-5-1 Chofu-ga-oka, Chofu, Tokyo 182-8585, Japan;

    Graduate School of Information Systems, University of Electro-Communications, 1-5-1 Chofu-ga-oka, Chofu, Tokyo 182-8585, Japan;

    Graduate School of Information Systems, University of Electro-Communications, 1-5-1 Chofu-ga-oka, Chofu, Tokyo 182-8585, Japan;

    Graduate School of Information Systems, University of Electro-Communications, 1-5-1 Chofu-ga-oka, Chofu, Tokyo 182-8585, Japan;

  • Indexing: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    action selection; human behavior; Markov decision process; reinforcement learning; inverse temperature parameter;

