Journal: Frontiers in Applied Mathematics and Statistics
Performance in Multi-Armed Bandit Tasks in Relation to Ambiguity-Preference Within a Learning Algorithm


Abstract

The Ellsberg paradox in decision theory posits that people will inevitably choose a known probability of winning over an unknown probability of winning, even if the known probability is low. One of the prevailing theories addressing the Ellsberg paradox is known as "ambiguity-aversion". In this study, we investigate the properties of ambiguity-aversion in four distinct types of reinforcement learning algorithms: UCB1-tuned, modified UCB1-tuned, softmax, and tug-of-war. We take as our sample a scenario in which there are two slot machines, and each machine dispenses a coin according to a probability that is generated by its own probability density function (PDF). We then investigate the choices of a learning algorithm in such multi-armed bandit tasks. The algorithms react differently in these tasks depending on their ambiguity-preference. Notably, we discovered a clear performance enhancement related to ambiguity-preference in a learning algorithm. Although this study does not directly address the ambiguity-aversion theory highlighted in the Ellsberg paradox, the differences among the learning algorithms suggest that there is room for further study regarding the Ellsberg paradox and decision theory.
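The two-machine setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each machine's win probability is redrawn from a Beta distribution on every pull (the abstract does not say which PDFs were used), it uses the standard UCB1-tuned index of Auer et al. (2002) and a Boltzmann-style softmax over empirical means, and it omits the modified UCB1-tuned and tug-of-war algorithms. All function names, the Beta parameters, and the temperature `tau` are hypothetical choices made for illustration.

```python
import math
import random


def pull(arm, rng):
    """Draw this pull's win probability from the arm's Beta(a, b) PDF
    (an assumed PDF family), then dispense a coin with that probability."""
    a, b = arm
    p = rng.betavariate(a, b)
    return 1.0 if rng.random() < p else 0.0


def ucb1_tuned_choice(counts, sums, sq_sums):
    """Return the arm with the highest standard UCB1-tuned index."""
    for j, n in enumerate(counts):
        if n == 0:  # play every arm once before using the index
            return j
    t = sum(counts)
    best_arm, best_index = 0, -math.inf
    for j, n in enumerate(counts):
        mean = sums[j] / n
        # Upper estimate of the reward variance of arm j.
        v = sq_sums[j] / n - mean ** 2 + math.sqrt(2.0 * math.log(t) / n)
        index = mean + math.sqrt(math.log(t) / n * min(0.25, v))
        if index > best_index:
            best_arm, best_index = j, index
    return best_arm


def softmax_choice(counts, sums, tau, rng):
    """Sample an arm with probability proportional to exp(mean / tau)."""
    means = [s / n if n else 0.0 for s, n in zip(sums, counts)]
    top = max(means)  # subtract the maximum for numerical stability
    weights = [math.exp((q - top) / tau) for q in means]
    return rng.choices(range(len(means)), weights=weights)[0]


def run(policy, arms, steps, seed=0, tau=0.1):
    """Play `steps` pulls and return (pull counts per arm, total coins won)."""
    rng = random.Random(seed)
    k = len(arms)
    counts, sums, sq_sums = [0] * k, [0.0] * k, [0.0] * k
    for _ in range(steps):
        if policy == "ucb1_tuned":
            j = ucb1_tuned_choice(counts, sums, sq_sums)
        elif policy == "softmax":
            j = softmax_choice(counts, sums, tau, rng)
        else:
            raise ValueError(f"unknown policy: {policy}")
        reward = pull(arms[j], rng)
        counts[j] += 1
        sums[j] += reward
        sq_sums[j] += reward * reward
    return counts, sum(sums)


if __name__ == "__main__":
    # Hypothetical machines: a narrow PDF around 0.6 and a broad PDF around 0.4.
    machines = [(60.0, 40.0), (0.4, 0.6)]
    for name in ("ucb1_tuned", "softmax"):
        print(name, run(name, machines, steps=1000))
```

In this sketch the spread of each Beta PDF stands in for how "ambiguous" a machine is, while its mean sets the long-run payout rate. Changing the parameters or `tau` is one way to probe how each rule trades off the two machines.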
