Journal: Machine Learning

Multi-objective multi-armed bandit with lexicographically ordered and satisficing objectives


Abstract

We consider the multi-objective multi-armed bandit with (i) lexicographically ordered and (ii) satisficing objectives. In the first problem, the goal is to select lexicographically optimal arms as often as possible without knowing the arm reward distributions beforehand. We capture this goal by defining a multi-dimensional form of regret that measures the loss due to not selecting lexicographically optimal arms, and then propose an algorithm that achieves $\tilde{O}(T^{2/3})$ gap-free regret and prove a regret lower bound of $\Omega(T^{2/3})$. We also consider two additional settings in which the learner has prior information on the expected arm rewards. In the first setting, the learner knows only the lexicographically optimal expected reward for each objective. In the second setting, it knows only a near-lexicographically optimal expected reward for each objective. For both settings, we prove that the learner achieves expected regret that is uniformly bounded in time. Then, we show that the algorithm we propose for the second setting of lexicographically ordered objectives with prior information also attains bounded regret for satisficing objectives. Finally, we experimentally evaluate the proposed algorithms in a variety of multi-objective learning problems.
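
To make the notion of a lexicographically optimal arm concrete, the following is a minimal Python sketch, not the algorithm from the paper. It assumes the expected reward vectors are known, which the learner in the abstract does not have, and that objectives are listed in decreasing priority. The function names, the example values, and the per-objective gap used here are illustrative assumptions; the abstract does not state the paper's exact regret definition.

```python
from typing import List, Sequence


def lexicographic_optimal_arms(
    means: Sequence[Sequence[float]], tol: float = 1e-12
) -> List[int]:
    """Return the indices of arms that are lexicographically optimal.

    means[a][j] is the (assumed known) expected reward of arm a in
    objective j, with objective 0 having the highest priority.
    """
    candidates = list(range(len(means)))
    num_objectives = len(means[0])
    for j in range(num_objectives):
        # Keep only the arms that are best in objective j among survivors.
        best = max(means[a][j] for a in candidates)
        candidates = [a for a in candidates if means[a][j] >= best - tol]
    return candidates


def per_objective_gaps(means: Sequence[Sequence[float]], arm: int) -> List[float]:
    """Illustrative gap vector: optimal arm's mean minus the chosen arm's mean.

    This is an assumed, simplified stand-in for a multi-dimensional regret.
    """
    star = lexicographic_optimal_arms(means)[0]
    return [means[star][j] - means[arm][j] for j in range(len(means[0]))]


# Hypothetical expected rewards for three arms and two objectives.
means = [
    [0.9, 0.2],
    [0.9, 0.7],
    [0.5, 1.0],
]

optimal = lexicographic_optimal_arms(means)  # arms 0 and 1 tie in objective 0
gaps = per_objective_gaps(means, arm=2)

print("lexicographically optimal arms:", optimal)
print("gap vector for arm 2:", gaps)
```

In this example arm 1 is the only lexicographically optimal arm: it ties with arm 0 in the first objective and beats it in the second. Arm 2 has the highest reward in the second objective yet is suboptimal, so its simple gap in that objective is negative, which suggests why a multi-dimensional regret has to be defined with care.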
