首页> 外文学位 >Sequential Optimization in Changing Environments: Theory and Application to Online Content Recommendation Services.
【24h】

Sequential Optimization in Changing Environments: Theory and Application to Online Content Recommendation Services.

机译:不断变化的环境中的顺序优化:在线内容推荐服务的理论和应用。

获取原文
获取原文并翻译 | 示例

摘要

Recent technological developments allow the online collection of valuable information that can be efficiently used to optimize decisions "on the fly" and at a low cost. These advances have greatly influenced the decision-making process in various areas of operations management, including pricing, inventory, and retail management. In this thesis we study methodological as well as practical aspects arising in online sequential optimization in the presence of such real-time information streams. On the methodological front, we study aspects of sequential optimization in the presence of temporal changes, such as designing decision making policies that adopt to temporal changes in the underlying environment (that drives performance) when only partial information about this changing environment is available, and quantifying the added complexity in sequential decision making problems when temporal changes are introduced. On the applied front, we study practical aspects associated with a class of online services that focus on creating customized recommendations (e.g., Amazon, Netflix). In particular, we focus on online content recommendations , a new class of online services that allows publishers to direct readers from articles they are currently reading to other web-based content they may be interested in, by means of links attached to said article.;In the first part of the thesis we consider a non-stationary variant of a sequential stochastic optimization problem, where the underlying cost functions may change along the horizon. We propose a measure, termed variation budget, that controls the extent of said change, and study how restrictions on this budget impact achievable performance. As a yardstick to quantify performance in non-stationary settings we propose a regret measure relative to a dynamic oracle benchmark. We identify sharp conditions under which it is possible to achieve long-run-average optimality and more refined performance measures such as rate optimality that fully characterize the complexity of such problems. In doing so, we also establish a strong connection between two rather disparate strands of literature: adversarial online convex optimization; and the more traditional stochastic approximation paradigm (couched in a non-stationary setting). This connection is the key to deriving well performing policies in the latter, by leveraging structure of optimal policies in the former. Finally, tight bounds on the minimax regret allow us to quantify the "price of non-stationarity," which mathematically captures the added complexity embedded in a temporally changing environment versus a stationary one.;In the second part of the thesis we consider another core stochastic optimization problem couched in a multi-armed bandit (MAB) setting. We develop a MAB formulation that allows for a broad range of temporal uncertainties in the rewards, characterize the (regret) complexity of this class of MAB problems by establishing a direct link between the extent of allowable reward "variation" and the minimal achievable worst-case regret, and provide an optimal policy that achieves that performance. Similarly to the first part of the thesis, our analysis draws concrete connections between two strands of literature: the adversarial and the stochastic MAB frameworks.;The third part of the thesis studies applied optimization aspects arising in online content recommendations, that allow web-based publishers to direct readers from articles they are currently reading to other web-based content. We study the content recommendation problem and its unique dynamic features from both theoretical as well as practical perspectives. Using a large data set of browsing history at major media sites, we develop a representation of content along two key dimensions: clickability, the likelihood to click to an article when it is recommended; and engageability, the likelihood to click from an article when it hosts a recommendation. Based on this representation, we propose a class of user path-focused heuristics, whose purpose is to simultaneously ensure a high instantaneous probability of clicking recommended articles, while also optimizing engagement along the future path. We rigorously quantify the performance of these heuristics and validate their impact through a live experiment. The third part of the thesis is based on a collaboration with a leading provider of content recommendations to online publishers.
机译:最近的技术发展允许在线收集有价值的信息,这些信息可以有效地用于“即时”以低成本优化决策。这些进步极大地影响了运营管理各个领域的决策过程,包括定价,库存和零售管理。在本文中,我们研究了在存在此类实时信息流的情况下,在线顺序优化产生的方法论和实践方面。在方法学方面,我们研究存在时间变化的情况下顺序优化的各个方面,例如设计决策策略,以便仅在有关该变化环境的部分信息可用时,采用基础环境的时间变化(驱动性能),并且当引入时间变化时,量化顺序决策问题中增加的复杂性。在应用方面,我们研究与一类专注于创建自定义推荐的在线服务相关的实际方面(例如,Amazon,Netflix)。特别是,我们专注于在线内容推荐,这是一类新的在线服务,它使发布者可以通过附加到该文章的链接将当前正在阅读的文章的读者引向他们可能感兴趣的其他基于网络的内容。在本文的第一部分,我们考虑了顺序随机优化问题的一个非平稳变体,其中潜在的成本函数可能会随时间变化。我们提出了一种称为变更预算的措施,该措施可控制上述变更的程度,并研究对该预算的限制如何影响可实现的效果。作为量化非平稳环境下性能的标准,我们提出了相对于动态Oracle基准的后悔措施。我们确定了可以实现长期平均最佳状态的尖锐条件,以及可以更好地表征此类问题复杂性的更精细的性能指标,例如速率最佳状态。这样,我们还可以在两个截然不同的文献之间建立牢固的联系:对抗性在线凸优化;以及更传统的随机逼近范例(在非平稳环境下进行)。通过利用前者中最佳策略的结构,这种联系是在后者中获得运行良好的策略的关键。最后,对极小极大遗憾的严格限制使我们能够量化“非平稳性的价格”,这从数学上捕获了随时间变化的环境相对于平稳环境所嵌入的增加的复杂性;在本文的第二部分,我们考虑了另一个核心多武装匪徒(MAB)设置中出现的随机优化问题。我们开发了一种MAB公式,该公式可在奖励中提供广泛的时间不确定性,并通过在允许的奖励“变化”的程度与可实现的最小差值之间建立直接联系来表征此类MAB问题的(遗憾)复杂性。案例遗憾,并提供实现该性能的最佳策略。与论文的第一部分相似,我们的分析在两类文献之间建立了具体的联系:对抗性和随机性的MAB框架。论文的第三部分研究了在线内容推荐中出现的优化方面,这些方面允许基于Web出版商将读者从他们当前正在阅读的文章引导到其他基于Web的内容。我们从理论和实践的角度研究内容推荐问题及其独特的动态特征。我们使用主要媒体站点上的大量浏览历史数据集,从两个关键维度来开发内容的表示形式:可点击性,推荐时点击某篇文章的可能性;和互动性,即当文章包含推荐时被点击的可能性。基于此表示,我们提出了一类以用户路径为中心的启发式方法,其目的是同时确保单击推荐文章的高瞬时概率,同时还优化沿未来路径的参与度。我们严格量化这些启发式方法的性能,并通过实时实验验证其影响。论文的第三部分基于与领先的向在线出版商提供内容推荐的提供商的合作。

著录项

  • 作者

    Gur, Yonatan.;

  • 作者单位

    Columbia University.;

  • 授予单位 Columbia University.;
  • 学科 Operations Research.;Business Administration Management.;Applied Mathematics.;Business Administration Marketing.;Web Studies.
  • 学位 Ph.D.
  • 年度 2014
  • 页码 143 p.
  • 总页数 143
  • 原文格式 PDF
  • 正文语种 eng
  • 中图分类
  • 关键词

相似文献

  • 外文文献
  • 中文文献
  • 专利
获取原文

客服邮箱:kefu@zhangqiaokeyan.com

京公网安备:11010802029741号 ICP备案号:京ICP备15016152号-6 六维联合信息科技 (北京) 有限公司©版权所有
  • 客服微信

  • 服务号