A Time and Space Efficient Algorithm for Contextual Linear Bandits

Abstract

We consider a multi-armed bandit problem where payoffs are a linear function of an observed stochastic contextual variable. In the scenario where there exists a gap between optimal and suboptimal rewards, several algorithms have been proposed that achieve O(log T) regret after T time steps. However, the proposed methods either have a computational complexity per iteration that scales linearly with T or achieve regret that grows linearly with the number of contexts |X|. We propose an ε-greedy type of algorithm that overcomes both limitations. In particular, when contexts are variables in R^d, we prove that our algorithm has a constant computational complexity per iteration of O(poly(d)) and can achieve a regret of O(poly(d) log T) even when |X| = Ω(2^d). In addition, unlike previous algorithms, its space complexity scales like O(Kd^2) and does not grow with T.
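To make the O(Kd^2) space claim concrete, here is a minimal sketch of a generic ε-greedy linear bandit. It is not the authors' algorithm: it assumes K denotes the number of arms, keeps one ridge-regression estimate per arm, and uses a hypothetical exploration schedule ε_t = min(1, K/t). The class name, the `ridge` parameter, and the schedule are all assumptions made for illustration. Each arm stores a d x d inverse matrix and a d-vector, updated with the Sherman-Morrison formula, so memory is O(Kd^2) and each step costs O(Kd^2) time regardless of T.

```python
import random


class EpsilonGreedyLinearBandit:
    """Illustrative epsilon-greedy bandit with per-arm ridge regression.

    Hypothetical sketch, not the algorithm from the paper.
    """

    def __init__(self, n_arms, dim, ridge=1.0, seed=0):
        self.n_arms = n_arms
        self.dim = dim
        self.rng = random.Random(seed)
        self.t = 0
        # Per arm: inverse of (ridge * I + sum of x x^T), a d x d matrix.
        self.a_inv = [
            [[(1.0 / ridge if i == j else 0.0) for j in range(dim)]
             for i in range(dim)]
            for _ in range(n_arms)
        ]
        # Per arm: b = sum of reward * x, a d-vector.
        self.b = [[0.0] * dim for _ in range(n_arms)]

    def estimate(self, arm):
        """Return the ridge estimate theta = A^{-1} b for one arm, in O(d^2)."""
        return [
            sum(row[j] * self.b[arm][j] for j in range(self.dim))
            for row in self.a_inv[arm]
        ]

    def select(self, x):
        """Pick an arm for context x: explore with probability eps, else exploit."""
        self.t += 1
        eps = min(1.0, self.n_arms / self.t)  # assumed decaying schedule
        if self.rng.random() < eps:
            return self.rng.randrange(self.n_arms)
        scores = [
            sum(th * xi for th, xi in zip(self.estimate(a), x))
            for a in range(self.n_arms)
        ]
        return max(range(self.n_arms), key=scores.__getitem__)

    def update(self, arm, x, reward):
        """Fold one (context, reward) pair into the chosen arm's statistics.

        Sherman-Morrison: (A + x x^T)^{-1}
            = A^{-1} - (A^{-1} x)(A^{-1} x)^T / (1 + x^T A^{-1} x),
        which holds here because A^{-1} is symmetric.
        """
        a_inv = self.a_inv[arm]
        u = [sum(row[j] * x[j] for j in range(self.dim)) for row in a_inv]
        denom = 1.0 + sum(xi * ui for xi, ui in zip(x, u))
        for i in range(self.dim):
            for j in range(self.dim):
                a_inv[i][j] -= u[i] * u[j] / denom
        for j in range(self.dim):
            self.b[arm][j] += reward * x[j]
```

In use, each round calls `select(x)` on the observed context, plays the returned arm, and passes the observed reward to `update(arm, x, reward)`. No history of past contexts is retained, which is the property the abstract highlights.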

Bibliographic details

Source
European Conference on Machine Learning and Knowledge Discovery in Databases | 2013 | pp. 257-272 | 16 pages
Authors
Jose Bento; Stratis Ioannidis; S. Muthukrishnan; Jinyun Yan
Original format: PDF
Keywords
Contextual Linear Bandits; Space and Time Efficiency

Similar literature

1. Efficient and robust algorithms for adversarial linear contextual bandits [J]. Gergely Neu, Julia Olkhovskaya. JMLR: Workshop and Conference Proceedings. 2020, No. 2010
2. Provably Optimal Algorithms for Generalized Linear Contextual Bandits [J]. Lihong Li, Yu Lu, Dengyong Zhou. JMLR: Workshop and Conference Proceedings. 2017, No. 1
3. An Unbiased Offline Evaluation of Contextual Bandit Algorithms with Generalized Linear Models [J]. John Langford, Lihong Li, Taesup Moon. JMLR: Workshop and Conference Proceedings. 2012, No. 2012
4. A Time and Space Efficient Algorithm for Contextual Linear Bandits [C]. Jose Bento, Stratis Ioannidis, S. Muthukrishnan. European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. 2013
5. Algorithms for bandit online linear optimization [D]. Dani, Varsha. 2008
6. Hybrid proximal linearized algorithm for the split DC program in infinite-dimensional real Hilbert spaces [O]. Chih-Sheng Chuang, Pei-Jung Yang
7. A Time and Space Efficient Algorithm for Contextual Linear Bandits [O]. José Bento, Stratis Ioannidis, S. Muthukrishnan. 2016
