Secure Best Arm Identification in Multi-armed Bandits


Abstract

The stochastic multi-armed bandit is a classical decision-making model in which an agent repeatedly chooses an action (pulls a bandit arm) and the environment responds with a stochastic outcome (reward) drawn from an unknown distribution associated with the chosen action. A popular objective for the agent is to identify the arm with the maximum expected reward, known as the best-arm identification problem. We address the inherent privacy concerns that arise in a best-arm identification problem when the data and computations are outsourced to an honest-but-curious cloud. Our main contribution is a distributed protocol that computes the best arm while guaranteeing that (i) no cloud node can simultaneously learn information about both the rewards and the ranking of the arms, and (ii) no information about the rewards or the ranking can be learned by analyzing the messages exchanged between the cloud nodes. In other words, the two properties ensure that the protocol has no single point of failure with respect to security. We rely on the partially homomorphic property of the well-known Paillier cryptosystem as a building block of our protocol. We prove the correctness of our protocol and present proof-of-concept experiments suggesting its practical feasibility.
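The additive homomorphic property mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' protocol: it is a toy Paillier implementation with tiny, insecure primes, and the names `keygen`, `encrypt`, `decrypt`, and `add_encrypted` are hypothetical helpers chosen for illustration. It only shows that a party holding ciphertexts can accumulate the rewards of an arm without seeing any individual reward.

```python
import math
import random


def keygen(p=293, q=433):
    """Toy Paillier key generation. The tiny default primes are an
    assumption for illustration only and offer no security."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1             # common simple choice of generator
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+); valid because g = n + 1
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n under the public key."""
    n, g = pub
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover the plaintext using L(x) = (x - 1) // n."""
    n, _ = pub
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts (mod n)."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen()
rewards = [1, 0, 1, 1]        # hypothetical Bernoulli rewards of one arm
enc_sum = encrypt(pub, 0)
for reward in rewards:
    # The accumulating party only ever handles ciphertexts.
    enc_sum = add_encrypted(pub, enc_sum, encrypt(pub, reward))
total = decrypt(pub, priv, enc_sum)  # equals sum(rewards)
```

Only the holder of the private key can recover `total`; the party performing the additions learns nothing about the individual rewards. A real deployment would use a vetted library and key sizes of at least 2048 bits.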


Bibliographic record

Source
International Conference on Information Security Practice and Experience | 2019 | xiii, 490 p. | 20 pages
Authors
Radu Ciucanu; Pascal Lafourcade; Marius Lombard-Platet; Marta Soare
Original format: PDF
Chinese Library Classification: Security and confidentiality
Keywords
Multi-armed bandits; Best arm identification; Privacy; Distributed computation; Paillier cryptosystem


Similar literature


1. Best arm identification in multi-armed bandits with delayed feedback [J]. Aditya Grover, Todor Markov, Peter Attia. JMLR: Workshop and Conference Proceedings, 2018, no. 3
2. On the Complexity of Best-Arm Identification in Multi-Armed Bandit Models [J]. Emilie Kaufmann, Olivier Cappé, Aurélien Garivier. Journal of Machine Learning Research, 2016, no. 1
3. On the Identification and Mitigation of Weaknesses in the Knowledge Gradient Policy for Multi-Armed Bandits [J]. Edwards James, Fearnhead Paul, Glazebrook Kevin. Probability in the Engineering and Informational Sciences, 2017, no. 2
4. Secure Best Arm Identification in Multi-armed Bandits [C]. Radu Ciucanu, Pascal Lafourcade, Marius Lombard-Platet. International Conference on Information Security Practice and Experience, 2019
5. Essays on sequential analysis: Multi-armed bandit with availability constraints and sequential change detection and identification [D]. Yamazaki, Kazutoshi. 2009
6. Smoking and the bandit: A preliminary study of smoker and non-smoker differences in exploratory behavior measured with a multi-armed bandit task [O]. Merideth A. Addicott, John M. Pearson, Jessica Wilson
7. On Sequential Elimination Algorithms for Best-Arm Identification in Multi-Armed Bandits [O]. Shahrampour, Shahin; Noshad, Mohammad; Tarokh, Vahid. 2017
