A Survey of Online Learning Methods: Thompson Sampling and Other Methods

何斯迈; 金羽佳; 王华; 葛冬冬

Abstract

This paper surveys recent research results, major theories, and algorithms in online learning. The field is broad, so we aim to introduce some of its basic algorithms and ideas, starting from the most classical theoretical models and algorithm designs and giving a general account of how the field has developed. We first take a classical online optimization model, the multi-armed bandit problem, as an example, and use it to introduce Thompson sampling and upper confidence bound (UCB) algorithms, analyzing and presenting their main ideas and latest results. We then discuss variants and applications of Thompson sampling in more complex online learning problems. The paper also gives a preliminary discussion of online convex optimization, which is a powerful tool for solving the multi-armed bandit problem and many other applied online learning problems.
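As a rough illustration of the two bandit algorithms named in the abstract, the sketch below runs Thompson sampling and a UCB rule on a simulated multi-armed bandit. It is not taken from the paper. It assumes Bernoulli rewards, a uniform Beta(1, 1) prior for Thompson sampling, and the common UCB1 index as one particular upper confidence bound variant. The function names `thompson_sampling` and `ucb1` and the arm means are hypothetical choices made for this example.

```python
import math
import random


def thompson_sampling(true_means, horizon, rng):
    """Beta-Bernoulli Thompson sampling (assumed setting); returns pulls per arm."""
    k = len(true_means)
    successes = [0] * k
    failures = [0] * k
    pulls = [0] * k
    for _ in range(horizon):
        # Draw one sample per arm from its Beta posterior (Beta(1, 1) prior).
        samples = [rng.betavariate(successes[i] + 1, failures[i] + 1)
                   for i in range(k)]
        # Play the arm whose sampled mean is largest.
        arm = max(range(k), key=lambda i: samples[i])
        # Simulated Bernoulli reward from the (hidden) true mean.
        reward = 1 if rng.random() < true_means[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1
    return pulls


def ucb1(true_means, horizon, rng):
    """UCB1 index policy (one UCB variant, assumed here); returns pulls per arm."""
    k = len(true_means)
    pulls = [0] * k
    totals = [0.0] * k
    for t in range(1, horizon + 1):
        if t <= k:
            # Initialization: play every arm once.
            arm = t - 1
        else:
            # Empirical mean plus an exploration bonus that shrinks with pulls.
            arm = max(
                range(k),
                key=lambda i: totals[i] / pulls[i]
                + math.sqrt(2.0 * math.log(t) / pulls[i]),
            )
        reward = 1 if rng.random() < true_means[arm] else 0
        totals[arm] += reward
        pulls[arm] += 1
    return pulls


if __name__ == "__main__":
    means = [0.2, 0.5, 0.8]  # hypothetical arm means, unknown to the learner
    print("Thompson sampling pulls:", thompson_sampling(means, 5000, random.Random(0)))
    print("UCB1 pulls:", ucb1(means, 5000, random.Random(0)))
```

In both routines the learner sees only the rewards of the arms it plays. Thompson sampling explores by randomizing over posterior samples, while UCB1 explores through a deterministic optimism bonus. Under the assumptions above, both should concentrate their pulls on the arm with the highest mean as the horizon grows.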

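Bibliographic information

Source
《运筹学学报》, 2017, No. 4, pp. 84-102 (19 pages)
Authors
何斯迈; 金羽佳; 王华; 葛冬冬
Affiliations

School of Information Management and Engineering, Shanghai University of Finance and Economics, Shanghai 200433;

School of Mathematical Sciences, Fudan University, Shanghai 200433;

School of Mathematical Sciences, Fudan University, Shanghai 200433;

Research Institute for Interdisciplinary Sciences, Shanghai University of Finance and Economics, Shanghai 200433

Original format: PDF
Language: Chinese
Keywords
online learning; multi-armed bandit; Thompson sampling; upper confidence bound algorithm; contextual multi-armed bandit; online convex optimization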
