Annual Conference on Learning Theory (COLT 2006); June 22–25, 2006; Pittsburgh, PA (US)

Logarithmic Regret Algorithms for Online Convex Optimization


Abstract

In an online convex optimization problem a decision-maker makes a sequence of decisions, i.e., chooses a sequence of points in Euclidean space, from a fixed feasible set. After each point is chosen, it encounters a sequence of (possibly unrelated) convex cost functions. Zinkevich [Zin03] introduced this framework, which models many natural repeated decision-making problems and generalizes many existing problems such as Prediction from Expert Advice and Cover's Universal Portfolios. Zinkevich showed that a simple online gradient descent algorithm achieves additive regret O(√T), for an arbitrary sequence of T convex cost functions (of bounded gradients), with respect to the best single decision in hindsight. In this paper, we give algorithms that achieve regret O(log T) for an arbitrary sequence of strictly convex functions (with bounded first and second derivatives). This mirrors what has been done for the special cases of prediction from expert advice by Kivinen and Warmuth [KW99], and Universal Portfolios by Cover [Cov91]. We propose several algorithms achieving logarithmic regret, which besides being more general are also much more efficient to implement. The main new ideas give rise to an efficient algorithm based on the Newton method for optimization, a new tool in the field. Our analysis shows a surprising connection to the follow-the-leader method, and builds on the recent work of Agarwal and Hazan [AH05]. We also analyze other algorithms, which tie together several different previous approaches including follow-the-leader, exponential weighting, Cover's algorithm and gradient descent.
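The online gradient descent algorithm the abstract attributes to Zinkevich can be sketched as follows: at each round, step against the gradient of the cost function just revealed, then project back onto the feasible set. This is a minimal illustrative sketch, not the paper's logarithmic-regret method; the Euclidean-ball feasible set, the step-size schedule eta_t = 1/sqrt(t), and all function names here are assumptions chosen for the example.

```python
import numpy as np

def project(x, radius=1.0):
    # Euclidean projection onto a ball of the given radius
    # (standing in for the fixed feasible set).
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def online_gradient_descent(grads, x0, radius=1.0):
    """Zinkevich-style OGD sketch: after round t we observe the gradient
    of that round's cost at the current point, take a step with
    eta_t = 1/sqrt(t), and project back onto the feasible set."""
    x = np.array(x0, dtype=float)
    iterates = [x.copy()]
    for t, grad in enumerate(grads, start=1):
        eta = 1.0 / np.sqrt(t)          # decreasing step size
        x = project(x - eta * grad(x), radius)
        iterates.append(x.copy())
    return iterates

# Toy usage: every round plays the same quadratic cost
# f_t(x) = ||x - b||^2, whose gradient at x is 2(x - b),
# so the best fixed decision in hindsight is b itself.
b = np.array([0.5, 0.0])
grads = [lambda x: 2 * (x - b)] * 50
iterates = online_gradient_descent(grads, [1.0, 1.0], radius=2.0)
```

With losses this regular the iterates settle at the hindsight optimum b; the O(√T) regret bound of course also covers adversarially chosen, unrelated convex costs, which this toy run does not exercise.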
