How pupil responses track value-based decision-making during and after reinforcement learning

Joanne C. Van Slooten; Sara Jahfari; Tomas Knapen; Jan Theeuwes

Abstract

Author summary: It has long been known that the pupil dilates when we decide. These pupil dilations have predominantly been linked to arousal. However, reward-related processes may trigger pupil dilations as well, as dilations have been linked to activity in the dopaminergic midbrain, a region important for reward processing and reinforcement learning. Using a learning task and a computational model to quantitatively describe the cognitive processes that drive reinforcement learning behavior, we show that the pupil closely tracks different aspects of the reinforcement learning process. Prior to making a value-based choice, pupil dilation reflected the value of the soon-to-be-chosen option. After receiving choice feedback, early dilation reflected uncertainty about the value of recent choice options, while late constriction reflected how strongly an outcome violated current value beliefs. These findings provide the novel insight that the pupil can be used to track value-based decision-making, opening up a new method for online tracking of reinforcement learning processes.
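
The summary refers to value beliefs and to outcomes that violate them, but does not specify the computational model. As a rough illustration only, the sketch below uses a generic delta-rule (Q-learning) update with a softmax choice rule. This is a textbook formulation assumed here, not necessarily the authors' model, and every name and parameter value (`alpha`, `beta`, `p_reward`, the 0.5 starting values) is hypothetical. The uncertainty signal mentioned in the summary is not modeled.

```python
import math
import random


def softmax_choice_prob(q_a, q_b, beta):
    """Probability of choosing option A over option B (two-option softmax)."""
    return 1.0 / (1.0 + math.exp(-beta * (q_a - q_b)))


def update_value(q, reward, alpha):
    """Delta-rule update. Returns the new value and the prediction error."""
    prediction_error = reward - q  # how strongly the outcome violates the belief
    return q + alpha * prediction_error, prediction_error


def simulate(n_trials=200, p_reward=(0.8, 0.2), alpha=0.1, beta=5.0, seed=0):
    """Simulate a two-option learning task with hypothetical parameters."""
    rng = random.Random(seed)
    q = [0.5, 0.5]  # assumed initial value beliefs
    history = []
    for _ in range(n_trials):
        p_a = softmax_choice_prob(q[0], q[1], beta)
        choice = 0 if rng.random() < p_a else 1
        reward = 1.0 if rng.random() < p_reward[choice] else 0.0
        chosen_value = q[choice]  # value of the chosen option before feedback
        q[choice], pe = update_value(q[choice], reward, alpha)
        history.append((choice, chosen_value, reward, pe))
    return q, history
```

In this sketch, `chosen_value` stands in for the pre-choice quantity (the value of the soon-to-be-chosen option) and `pe` for the post-feedback quantity (the belief violation) that the summary relates to pupil responses.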

Bibliographic details

Source: PLoS Computational Biology, 2018, No. 11, 24 pages
Authors: Joanne C. Van Slooten; Sara Jahfari; Tomas Knapen; Jan Theeuwes
Original format: PDF
CLC classification: Cell biology
Keywords: Decision making; Learning; Cognition; Human learning; Permutation; Eye movements; Attention; Dopamine

Similar literature

1. Changes of Attention during Value-Based Reversal Learning Are Tracked by N2pc and Feedback-Related Negativity [J]. Mariann Oemisch, Marcus R. Watson, Thilo Womelsdorf. Frontiers in Human Neuroscience. 2017, No. 1
2. Energy management based on reinforcement learning with double deep Q-learning for a hybrid electric tracked vehicle [J]. Han Xuefeng, He Hongwen, Wu Jingda. Applied Energy. 2019, No. Nova15
3. Adaptive Fault-Tolerant Tracking Control for MIMO Discrete-Time Systems via Reinforcement Learning Algorithm With Less Learning Parameters [J]. Lei Liu, Zhanshan Wang, Huaguang Zhang. IEEE Transactions on Automation Science and Engineering. 2017, No. 1
4. Tracking as Online Decision-Making: Learning a Policy from Streaming Videos with Reinforcement Learning [C]. James Supancic, Deva Ramanan. IEEE International Conference on Computer Vision. 2017
5. Individual Differences in Value-Based Decision-Making: Learning and Time Preference [D]. Pehlivanova, Marieta. 2017
6. Correction: How pupil responses track value-based decision-making during and after reinforcement learning [O]. Joanne C. Van Slooten, Sara Jahfari, Tomas Knapen. 2019
7. Pupil responses as indicators of value-based decision-making [O]. Joanne C. Van Slooten, Sara Jahfari, Tomas Knapen. 2018
8. Predicting What Reinforcement Learning Will Tell You: A Model of Human Decision-Making in Multi-Stage Games [R]. B. Tracey, D. H. Wolpert, J. Bono, R. Lee, R. W. Bent, S. N. Backhaus. 2011