Defensive Universal Learning with Experts

Abstract

This paper shows how universal learning can be achieved with expert advice. To this end, we specify an experts algorithm with the following characteristics: (a) it uses only feedback from the actions actually chosen (bandit setup), (b) it can be applied with countably infinite expert classes, and (c) it copes with losses that may grow appropriately slowly over time. We prove loss bounds against an adaptive adversary. From this, we obtain a master algorithm for "reactive" experts problems, which means that the master's actions may influence the behavior of the adversary. Our algorithm can significantly outperform standard experts algorithms on such problems. Finally, we combine it with a universal expert class. The resulting universal learner performs - in a certain sense - almost as well as any computable strategy, for any online decision problem. We also specify the (worst-case) convergence speed, which is very slow.
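
To make the bandit setup in (a) concrete, here is a minimal sketch of a standard exponential-weights bandit learner (Exp3-style). It is not the algorithm specified in the paper: it assumes a finite expert class, losses bounded in [0, 1], and each expert standing for one fixed action. The names `exp3`, `loss_fn`, `gamma`, and `eta` are hypothetical and chosen only for illustration. The point shown is that the learner sees only the loss of the expert it actually followed and corrects for this with importance weighting.

```python
import math
import random


def exp3(n_experts, loss_fn, horizon, gamma=0.1, eta=0.1, seed=0):
    """Exp3-style learner under bandit feedback (illustrative sketch only).

    loss_fn(t, i) returns the loss in [0, 1] of following expert i at step t.
    It is called only for the expert that was actually chosen.
    """
    rng = random.Random(seed)
    # Cumulative importance-weighted loss estimates, one per expert.
    est_loss = [0.0] * n_experts
    total_loss = 0.0
    probs = [1.0 / n_experts] * n_experts
    for t in range(horizon):
        # Exponential weights; subtract the minimum for numerical stability.
        m = min(est_loss)
        weights = [math.exp(-eta * (l - m)) for l in est_loss]
        z = sum(weights)
        # Mix with uniform exploration so every probability is >= gamma / n.
        probs = [(1 - gamma) * w / z + gamma / n_experts for w in weights]
        chosen = rng.choices(range(n_experts), weights=probs)[0]
        loss = loss_fn(t, chosen)  # the only feedback the learner observes
        total_loss += loss
        # Dividing by the selection probability keeps the estimate unbiased.
        est_loss[chosen] += loss / probs[chosen]
    return total_loss, probs


if __name__ == "__main__":
    # Toy problem (an assumption): expert 2 never incurs loss, the others always do.
    total, final_probs = exp3(3, lambda t, i: 0.0 if i == 2 else 1.0, 2000)
    print(total, final_probs)
```

The uniform exploration term is what lets the loss estimates stay finite under bandit feedback. Handling countably infinite expert classes, slowly growing losses, and reactive adversaries, as the abstract describes, requires more than this sketch provides.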
