
Toward a classification of finite partial-monitoring games



Abstract

Partial-monitoring games constitute a mathematical framework for sequential decision-making problems with imperfect feedback: the learner repeatedly chooses an action, the opponent responds with an outcome, and then the learner suffers a loss and receives a feedback signal, both of which are fixed functions of the action and the outcome. The goal of the learner is to minimize his total cumulative loss. We make progress towards the classification of these games based on their minimax expected regret. Namely, we classify almost all games with two outcomes and a finite number of actions: we show that their minimax expected regret is either zero, $\widetilde{\Theta}(\sqrt{T})$, $\Theta(T^{2/3})$, or $\Theta(T)$, and we give a simple and efficiently computable classification of these four classes of games. Our hope is that the result can serve as a stepping stone toward classifying all finite partial-monitoring games.
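The interaction protocol described in the abstract can be illustrated with a minimal sketch. The matrices `LOSS` and `FEEDBACK`, the function names, and the uniform-random learner are all hypothetical choices made for illustration; none of them come from the paper, and the sketch does not implement the paper's classification or any of its algorithms. It only shows the round structure and how regret against the best fixed action is measured.

```python
import random

# Hypothetical two-outcome game with three actions (illustrative, not from the paper).
# LOSS[a][o] is the loss of action a under outcome o.
# FEEDBACK[a][o] is the signal the learner observes after playing a under outcome o.
LOSS = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
FEEDBACK = [["x", "x"], ["x", "x"], ["a", "b"]]  # only action 2 reveals the outcome


def uniform_learner(history, n_actions, rng):
    """Placeholder learner: ignores the feedback history and picks uniformly."""
    return rng.randrange(n_actions)


def constant_opponent(outcome):
    """Oblivious opponent that always plays the same outcome."""
    return lambda t, rng: outcome


def play(loss, feedback, learner, opponent, T, seed=0):
    """Run T rounds and return the regret against the best fixed action."""
    rng = random.Random(seed)
    n_actions = len(loss)
    history = []   # (action, feedback signal) pairs: all the learner ever sees
    outcomes = []  # kept by the environment only, hidden from the learner
    total = 0.0
    for t in range(T):
        a = learner(history, n_actions, rng)
        o = opponent(t, rng)
        total += loss[a][o]                  # loss is suffered but not observed
        history.append((a, feedback[a][o]))  # only the signal is revealed
        outcomes.append(o)
    best = min(sum(loss[i][o] for o in outcomes) for i in range(n_actions))
    return total - best


if __name__ == "__main__":
    print(play(LOSS, FEEDBACK, uniform_learner, constant_opponent(1), T=1000))
```

In this toy game the learner never sees its loss directly, and only the third action produces an informative signal, which is the kind of trade-off between information and loss that the regret rates in the abstract quantify.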
