Annual Allerton Conference on Communication, Control, and Computing

Collaborative learning of stochastic bandits over a social network


Abstract

We consider a collaborative online learning paradigm, wherein a group of agents connected through a social network are engaged in playing a stochastic multi-armed bandit game. Each time an agent takes an action, the corresponding reward is instantaneously observed by the agent, as well as by its neighbours in the social network. We perform a regret analysis of various policies in this collaborative learning setting. A key finding of this paper is that natural extensions of widely-studied single-agent learning policies to the network setting need not perform well in terms of regret. In particular, we identify a class of non-altruistic and individually consistent policies, and show, by deriving regret lower bounds, that they can suffer large regret in the networked setting. We also show that the learning performance can be substantially improved if the agents exploit the structure of the network, and we develop a simple learning algorithm based on dominating sets of the network. Specifically, we first consider a star network, which is a common motif in hierarchical social networks, and show analytically that the hub agent can be used as an information sink to expedite learning and improve the overall regret. We also derive network-wide regret bounds for the algorithm applied to general networks. We conduct numerical experiments on a variety of networks to corroborate our analytical results.
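The information-sharing setup in the abstract can be illustrated on the star network it describes. The sketch below is our own illustration under assumed details, not the paper's algorithm: each agent runs a standard UCB1 policy on statistics pooled from its neighbourhood, rewards are broadcast to neighbours after each round, and the two Bernoulli arm means are hypothetical. In a star, the hub observes every agent's plays, so its statistics grow much faster than any leaf's.

```python
import math
import random

def ucb_index(total, count, t):
    """Standard UCB1 index; unplayed arms get infinite priority."""
    if count == 0:
        return float("inf")
    return total / count + math.sqrt(2.0 * math.log(t) / count)

def simulate_star(num_leaves=3, num_arms=2, horizon=200, seed=0):
    rng = random.Random(seed)
    arm_means = [0.9, 0.5]          # hypothetical Bernoulli arm means
    agents = list(range(num_leaves + 1))  # agent 0 is the hub
    # neighbours[i] = agents whose rewards agent i observes (incl. itself)
    neighbours = {0: set(agents)}
    for leaf in range(1, num_leaves + 1):
        neighbours[leaf] = {0, leaf}
    counts = {a: [0] * num_arms for a in agents}
    sums = {a: [0.0] * num_arms for a in agents}
    for t in range(1, horizon + 1):
        # Each agent picks the arm with the highest UCB index on its pooled stats.
        plays = {a: max(range(num_arms),
                        key=lambda k: ucb_index(sums[a][k], counts[a][k], t))
                 for a in agents}
        # Rewards are broadcast to each player's neighbourhood.
        for a, arm in plays.items():
            reward = 1.0 if rng.random() < arm_means[arm] else 0.0
            for b in agents:
                if a in neighbours[b]:
                    counts[b][arm] += 1
                    sums[b][arm] += reward
    return counts

if __name__ == "__main__":
    counts = simulate_star()
    # The hub observes every play: 4 agents * 200 rounds = 800 observations,
    # while each leaf only sees its own plays plus the hub's: 400.
    print(sum(counts[0]), sum(counts[1]))  # → 800 400
```

This also makes the paper's caution concrete: a leaf that free-rides on the hub's samples is "non-altruistic", and the dominating-set idea generalises the hub's role to networks that are not stars.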
