
Online Controlled Experiment Design: Trade-off Between Statistical Uncertainty and Cumulative Reward.


Abstract

Online experiments are widely used in online advertising and web development to compare the effects, e.g. click-through rate or conversion rate, of different versions. Among all designs, A/B testing is the most popular: it randomly segments users into two groups with equal probability and shows each group a different version. This method is easy to implement, but its shortcoming is also obvious: to measure both versions, it cannot expose all users to the best one, which leads to a potential loss of reward, e.g. clicks and conversions. Though some loss is inevitable in an experiment, it can be reduced. Reducing the loss is essentially equivalent to maximizing cumulative reward, which is also the goal of the classical multi-armed bandit problem. Thus, multi-armed bandit algorithms have been proposed to reduce the potential loss during experiments. Compared with A/B testing, multi-armed bandit algorithms produce more cumulative reward during the experiment. However, they suffer from high statistical uncertainty: for example, they need more users than A/B testing to reach a given statistical significance level.

To address this problem, this thesis builds a model to analyze two conflicting goals: reducing statistical uncertainty and maximizing cumulative reward. We develop an algorithm for online experiments that balances the trade-off between these two goals. Our analysis focuses on one kind of online experiment: the batch-updating binomial experiment. We first discuss several statistical uncertainty criteria and propose corresponding algorithms to optimize them. We then extend several multi-armed bandit algorithms to maximize cumulative reward in the batch-updating setting. In addition, we propose a new algorithm, sequential two stages (STS), to solve this problem. Finally, an improved performance evaluation method that integrates statistical uncertainty with cumulative reward is put forward. Instead of simply combining two objective functions, this new measure, the virtual future measure (VFM), connects statistical uncertainty and cumulative reward directly through virtual future reward. Compared with other methods, our proposed algorithm STS is well suited to optimizing VFM.
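The batch-updating binomial setting described above can be sketched as a small simulation. The numbers below (click-through rates of 0.05 and 0.08, 50 batches of 200 users) are hypothetical, and the Thompson-sampling policy stands in as a representative of the bandit algorithms the thesis extends; the STS algorithm and VFM themselves are not reproduced here.

```python
import random

def run_batch_experiment(true_rates, n_batches, batch_size, policy, seed=0):
    """Simulate a batch-updating binomial experiment over two versions.

    policy(succ, fail, rng) returns the probability of assigning a user
    to version 1 for the next batch; counts are only updated between
    batches, matching the batch-updating setting.
    """
    rng = random.Random(seed)
    succ = [0, 0]  # successes (e.g. clicks) per version
    fail = [0, 0]  # failures per version
    reward = 0     # cumulative reward over the whole experiment
    for _ in range(n_batches):
        p1 = policy(succ, fail, rng)
        for _ in range(batch_size):
            arm = 1 if rng.random() < p1 else 0
            if rng.random() < true_rates[arm]:
                succ[arm] += 1
                reward += 1
            else:
                fail[arm] += 1
    return reward, succ, fail

def ab_policy(succ, fail, rng):
    # Classic A/B test: every user is assigned uniformly at random.
    return 0.5

def thompson_policy(succ, fail, rng):
    # Thompson sampling with Beta(1, 1) priors: estimate the posterior
    # probability that version 1 beats version 0 by Monte Carlo, and
    # use it as the assignment probability for the next batch.
    wins = sum(
        rng.betavariate(succ[1] + 1, fail[1] + 1)
        > rng.betavariate(succ[0] + 1, fail[0] + 1)
        for _ in range(200)
    )
    return wins / 200

rates = [0.05, 0.08]  # hypothetical click-through rates
r_ab, succ_ab, fail_ab = run_batch_experiment(rates, 50, 200, ab_policy)
r_ts, succ_ts, fail_ts = run_batch_experiment(rates, 50, 200, thompson_policy)
print("A/B cumulative reward:", r_ab)
print("Thompson cumulative reward:", r_ts)
```

The bandit policy typically earns more reward by routing traffic toward the better version, but it also starves the worse version of samples, which is exactly the source of the higher statistical uncertainty discussed above.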

Bibliographic details

  • Author: Dai, Liang.
  • Affiliation: University of California, Santa Cruz.
  • Degree grantor: University of California, Santa Cruz.
  • Subject: Information science; Computer science.
  • Degree: M.S.
  • Year: 2014
  • Pages: 67 p.
  • Format: PDF
  • Language: eng
