Source: JMLR: Workshop and Conference Proceedings

Online Multiclass Boosting with Bandit Feedback


Abstract

We present online boosting algorithms for multiclass classification with bandit feedback, where the learner receives feedback only about the correctness of its prediction. We propose an unbiased estimate of the loss based on a randomized prediction, which allows the model to update its weak learners with limited information. Using this unbiased estimate, we extend two full-information boosting algorithms (Jung et al., 2017) to the bandit setting. We prove that the asymptotic error bounds of the bandit algorithms exactly match those of their full-information counterparts; the cost of restricted feedback is reflected in a larger sample complexity. Experimental results support our theoretical findings, and the proposed models perform comparably to an existing bandit boosting algorithm that is limited to binary weak learners.
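The abstract does not spell out the estimator, so the following is only a minimal sketch of one standard way to obtain an unbiased loss estimate from a randomized prediction: importance weighting of the 0-1 loss under an exploration distribution. The exploration rate `rho`, the mixing with a uniform distribution, and all function names are assumptions made for illustration; the paper's actual construction may differ.

```python
import random


def exploration_distribution(greedy_label, num_classes, rho):
    """Mix a point mass on the greedy label with the uniform distribution.

    `rho` is a hypothetical exploration rate in (0, 1]; every label gets
    probability at least rho / num_classes.
    """
    probs = [rho / num_classes] * num_classes
    probs[greedy_label] += 1.0 - rho
    return probs


def estimate_zero_one_loss(predicted_label, was_correct, probs):
    """Importance-weighted estimate of the 0-1 loss vector.

    Only the entry of the label that was actually predicted can be nonzero,
    because bandit feedback reveals nothing about the other labels.
    """
    estimate = [0.0] * len(probs)
    if not was_correct:
        estimate[predicted_label] = 1.0 / probs[predicted_label]
    return estimate


def expected_estimate(true_label, probs):
    """Exact expectation of the estimate over the random prediction."""
    num_classes = len(probs)
    expectation = [0.0] * num_classes
    for label in range(num_classes):
        est = estimate_zero_one_loss(label, label == true_label, probs)
        for i in range(num_classes):
            expectation[i] += probs[label] * est[i]
    return expectation


def play_round(greedy_label, true_label, num_classes, rho, rng):
    """Simulate one round: sample a prediction, observe one bit, estimate."""
    probs = exploration_distribution(greedy_label, num_classes, rho)
    predicted = rng.choices(range(num_classes), weights=probs)[0]
    was_correct = predicted == true_label  # the only feedback the learner sees
    return predicted, estimate_zero_one_loss(predicted, was_correct, probs)
```

In this sketch, `expected_estimate` recovers the true 0-1 loss vector (0 for the correct label, 1 elsewhere) for any `rho > 0`, which is what unbiasedness means here. Smaller values of `rho` make the nonzero entries larger and the estimate noisier, which is one intuition for why restricted feedback costs more samples.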
