Minimax Concave Penalized Multi-Armed Bandit Model with High-Dimensional Covariates

Xue Wang; Mingcheng Wei; Tao Yao


Abstract

In this paper, we propose a Minimax Concave Penalized Multi-Armed Bandit (MCP-Bandit) algorithm for a decision-maker facing high-dimensional data with latent sparse structure in an online learning and decision-making process. We demonstrate that the MCP-Bandit algorithm asymptotically achieves the optimal cumulative regret in sample size T, O(log T), and further attains a tighter bound in both the covariate dimension d and the number of significant covariates s, O(s^2 (s + log d)). In addition, we develop a linear approximation method, the 2-step Weighted Lasso procedure, to identify the MCP estimator for the MCP-Bandit algorithm under non-i.i.d. samples. Using this procedure, the MCP estimator matches the oracle estimator with high probability. Finally, we present two experiments to benchmark the proposed MCP-Bandit algorithm against other bandit algorithms. Both experiments demonstrate that the MCP-Bandit algorithm performs favorably against the benchmark algorithms, especially when there is a high level of data sparsity or when the sample size is not too small.
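The abstract does not state the penalty or the estimation steps explicitly, so the following is only a minimal sketch under stated assumptions. It uses the standard minimax concave penalty from the statistics literature and the usual local linear approximation: a plain Lasso fit, followed by a second Lasso whose per-coefficient weights are the MCP derivative evaluated at the first-step estimate. The names `lam`, `a`, `weighted_lasso`, and `two_step_weighted_lasso` are hypothetical, the proximal-gradient solver is an illustrative choice rather than the authors' method, and the bandit sampling loop with its non-i.i.d. samples is not modeled.

```python
import numpy as np


def mcp_penalty(beta, lam, a):
    """Standard MCP: lam*|b| - b^2/(2a) if |b| <= a*lam, else a*lam^2/2."""
    b = np.abs(np.asarray(beta, dtype=float))
    return np.where(b <= a * lam, lam * b - b**2 / (2.0 * a), 0.5 * a * lam**2)


def mcp_derivative(beta, lam, a):
    """Derivative of the MCP with respect to |b|: max(lam - |b|/a, 0)."""
    b = np.abs(np.asarray(beta, dtype=float))
    return np.maximum(lam - b / a, 0.0)


def weighted_lasso(X, y, weights, n_iter=500):
    """Proximal gradient (ISTA) for (1/2n)||y - Xb||^2 + sum_j w_j |b_j|.

    Illustrative solver only (an assumption, not taken from the paper).
    """
    n, d = X.shape
    # Step size = 1 / Lipschitz constant of the gradient of the squared loss.
    step = n / (np.linalg.norm(X, 2) ** 2)
    b = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n
        z = b - step * grad
        # Soft-thresholding with a separate threshold per coefficient.
        b = np.sign(z) * np.maximum(np.abs(z) - step * weights, 0.0)
    return b


def two_step_weighted_lasso(X, y, lam, a):
    """Step 1: plain Lasso. Step 2: Lasso reweighted by the MCP derivative."""
    d = X.shape[1]
    b1 = weighted_lasso(X, y, np.full(d, lam))
    w = mcp_derivative(b1, lam, a)  # large coefficients get weight 0
    return weighted_lasso(X, y, w)


if __name__ == "__main__":
    # Toy orthogonal design: X^T X / n is the identity.
    X = 2.0 * np.eye(4)
    beta_true = np.array([5.0, 0.0, 0.0, 0.0])
    y = X @ beta_true
    print(two_step_weighted_lasso(X, y, lam=1.0, a=3.0))
```

In this toy orthogonal design the first step shrinks the large coefficient by `lam`, while the second step assigns it zero weight and leaves it unshrunk. That is the intuition behind an MCP estimator coinciding with the oracle estimator on the significant covariates.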


Bibliographic record

Source: JMLR: Workshop and Conference Proceedings, 2018, No. 2010, 9 pages
Authors: Xue Wang; Mingcheng Wei; Tao Yao
Original format: PDF

