Learning Contextual Reward Expectations for Value Adaptation

Francesco Rigoli; Benjamin Chew; Peter Dayan; Raymond J. Dolan


Abstract

Substantial evidence indicates that subjective value is adapted to the statistics of reward expected within a given temporal context. However, how these contextual expectations are learned is poorly understood. To examine such learning, we exploited a recent observation that participants performing a gambling task adjust their preferences as a function of context. We show that, in the absence of contextual cues providing reward information, an average reward expectation was learned from recent past experience. Learning dependent on contextual cues emerged when two contexts alternated at a fast rate, whereas both cue-independent and cue-dependent forms of learning were apparent when two contexts alternated at a slower rate. Motivated by these behavioral findings, we reanalyzed a previous fMRI data set to probe the neural substrates of learning contextual reward expectations. We observed a form of reward prediction error related to average reward such that, at option presentation, activity in ventral tegmental area/substantia nigra and ventral striatum correlated positively and negatively, respectively, with the actual and predicted value of options. Moreover, an inverse correlation between activity in ventral tegmental area/substantia nigra (but not striatum) and predicted option value was greater in participants showing enhanced choice adaptation to context. These findings help clarify the mechanisms underlying the learning of contextual reward expectations.
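The abstract does not specify the learning model, so the following is only a minimal sketch of one common way an average reward expectation could be tracked from recent experience: a delta rule, kept both as a single cue-independent estimate and as separate cue-dependent estimates. The class name `ContextLearner`, the learning rate `alpha`, and the zero initial expectations are assumptions for illustration and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class ContextLearner:
    """Hypothetical delta-rule tracker of average reward (not the authors' model)."""

    alpha: float = 0.1          # learning rate; assumed value
    global_mean: float = 0.0    # cue-independent expectation, starts at 0 by assumption
    cue_means: dict = field(default_factory=dict)  # cue-dependent expectations

    def update(self, reward: float, cue=None) -> float:
        """Fold one observed reward into the expectations.

        Returns the cue-independent prediction error, i.e. the reward
        minus the expectation held before this update.
        """
        error = reward - self.global_mean
        self.global_mean += self.alpha * error
        if cue is not None:
            # A separate running estimate per contextual cue.
            prior = self.cue_means.get(cue, 0.0)
            self.cue_means[cue] = prior + self.alpha * (reward - prior)
        return error


learner = ContextLearner(alpha=0.5)
first_error = learner.update(4.0)               # no cue: only the global estimate moves
second_error = learner.update(4.0, cue="high")  # cue present: both estimates move
```

In this sketch the prediction error shrinks as the expectation approaches the rewards actually received, which is the sense in which an expectation is "learned from recent past experience"; a larger `alpha` weights recent outcomes more heavily.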


Bibliographic record

Source: Journal of Cognitive Neuroscience, 2018, No. 1, pp. 50-69 (20 pages)
Authors: Francesco Rigoli; Benjamin Chew; Peter Dayan; Raymond J. Dolan
Original format: PDF
Language: English

