
Saccade selection when reward probability is dynamically manipulated using Markov chains


Abstract

Markov chains (stochastic processes where probabilities are assigned based on the previous outcome) are commonly used to examine the transitions between behavioral states, such as those that occur during foraging or social interactions. However, relatively little is known about how well primates can incorporate knowledge about Markov chains into their behavior. Saccadic eye movements are an example of a simple behavior influenced by information about probability, and thus are good candidates for testing whether subjects can learn Markov chains. In addition, when investigating the influence of probability on saccade target selection, the use of Markov chains could provide an alternative method that avoids confounds present in other task designs. To investigate these possibilities, we evaluated human behavior on a task in which stimulus reward probabilities were assigned using a Markov chain. On each trial, the subject selected one of four identical stimuli by saccade; after selection, feedback indicated the rewarded stimulus. Each session consisted of 200–600 trials, and on some sessions, the reward magnitude varied. On sessions with a uniform reward, subjects (n = 6) learned to select stimuli at a frequency close to reward probability, which is similar to human behavior on matching or probability classification tasks. When informed that a Markov chain assigned reward probabilities, subjects (n = 3) learned to select the greatest reward probability more often, bringing them close to behavior that maximizes reward. On sessions where reward magnitude varied across stimuli, subjects (n = 6) demonstrated preferences for both greater reward probability and greater reward magnitude, resulting in a preference for greater expected value (the product of reward probability and magnitude). These results demonstrate that Markov chains can be used to dynamically assign probabilities that are rapidly exploited by human subjects during saccade target selection.
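The abstract gives enough of the task structure to sketch a simulation: a Markov chain determines which of the four stimuli is rewarded on each trial, a "matching" subject chooses each stimulus at a frequency close to its current reward probability, and a maximizing subject always picks the greatest expected value (reward probability times magnitude). The Python sketch below illustrates that logic only; the transition matrix, reward magnitudes, session length, and the matching rule are illustrative assumptions, not the authors' actual parameters.

```python
import numpy as np

# Hypothetical 4-state transition matrix: row i gives the probability that each
# of the four stimuli is rewarded on the next trial, given that stimulus i was
# rewarded on the current trial. Values are illustrative, not from the paper.
TRANSITION = np.array([
    [0.10, 0.60, 0.15, 0.15],
    [0.15, 0.10, 0.60, 0.15],
    [0.15, 0.15, 0.10, 0.60],
    [0.60, 0.15, 0.15, 0.10],
])

MAGNITUDE = np.array([1.0, 1.0, 1.0, 1.0])  # uniform-reward session

def run_session(n_trials=400, maximize=False, rng=None):
    """Simulate one session in which the rewarded stimulus follows the Markov
    chain. A matching subject chooses in proportion to the current reward
    probabilities; a maximizing subject picks the greatest expected value."""
    if rng is None:
        rng = np.random.default_rng()
    rewarded = rng.integers(4)               # stimulus rewarded on the first trial
    earned = 0.0
    for _ in range(n_trials):
        p_next = TRANSITION[rewarded]        # current reward probabilities
        if maximize:
            choice = int(np.argmax(p_next * MAGNITUDE))  # greatest expected value
        else:
            choice = rng.choice(4, p=p_next)             # probability matching
        rewarded = rng.choice(4, p=p_next)   # feedback: which stimulus pays out
        if choice == rewarded:
            earned += MAGNITUDE[rewarded]
    return earned
```

Comparing run_session(maximize=False) with run_session(maximize=True) reproduces, under these assumed parameters, the qualitative contrast the abstract reports: matching yields less reward than consistently selecting the stimulus with the greatest expected value.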
