Tug-of-War Model for Multi-armed Bandit Problem

Abstract

We propose the "tug-of-war (TOW) model", which conducts unique parallel searches using many nonlocally correlated search agents. The model is based on a property of a single-celled amoeba, the true slime mold Physarum, which maintains a constant intracellular resource volume while collecting environmental information by concurrently expanding and shrinking its branches. This conservation law entails a "nonlocal correlation" among the branches: a volume increment in one branch is immediately compensated by volume decrement(s) in the other branch(es). This nonlocal correlation was shown to be useful for decision making in the case of a dilemma. The multi-armed bandit problem is to determine the optimal strategy for maximizing the total reward under incompatible demands, namely exploring the arms to learn their payoffs and exploiting the arm that currently looks best. Our model efficiently manages this "exploration-exploitation dilemma" and performs well. Its average accuracy rate is higher than those of well-known algorithms such as the modified ε-greedy algorithm and the modified softmax algorithm.
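The abstract gives no equations, so the following Python sketch is only an illustration under stated assumptions, not the authors' TOW dynamics. It shows a plain ε-greedy baseline (not the "modified" variant the abstract mentions) on a Bernoulli bandit, next to a toy selector whose per-arm "volumes" obey a zero-sum update, so that a gain in one branch is immediately offset by losses in the others. The function names, the ±1 update size, and the uniform noise term are all hypothetical choices made for this example.

```python
import random

def pull(probs, arm, rng):
    """Bernoulli reward: 1 with probability probs[arm], else 0."""
    return 1 if rng.random() < probs[arm] else 0

def epsilon_greedy(probs, steps, epsilon, rng):
    """Plain epsilon-greedy baseline (assumed form, not the paper's modified one)."""
    n = len(probs)
    counts = [0] * n      # number of pulls per arm
    means = [0.0] * n     # running mean reward per arm
    total = 0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n)                        # explore
        else:
            arm = max(range(n), key=lambda k: means[k])   # exploit
        reward = pull(probs, arm, rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
        total += reward
    return total, counts

def conserved_update(volumes, arm, delta):
    """Hypothetical zero-sum update: add delta to one branch and remove
    delta / (n - 1) from every other branch, so the total volume is constant.
    Assumes at least two branches."""
    share = delta / (len(volumes) - 1)
    return [v + delta if k == arm else v - share for k, v in enumerate(volumes)]

def toy_conserved_bandit(probs, steps, noise, rng):
    """Toy selector: play the branch with the largest noisy volume, then
    grow it on a reward and shrink it otherwise (illustrative only)."""
    n = len(probs)
    volumes = [0.0] * n
    counts = [0] * n
    total = 0
    for _ in range(steps):
        arm = max(range(n), key=lambda k: volumes[k] + rng.uniform(-noise, noise))
        reward = pull(probs, arm, rng)
        volumes = conserved_update(volumes, arm, 1.0 if reward else -1.0)
        counts[arm] += 1
        total += reward
    return total, counts, volumes

if __name__ == "__main__":
    rng = random.Random(0)
    print(epsilon_greedy([0.3, 0.7], 1000, 0.1, rng))
    print(toy_conserved_bandit([0.3, 0.7], 1000, 0.5, rng))
```

Because every update is zero-sum, the volumes always add up to their initial total (here 0), which is the conservation property the abstract describes. Whether this toy rule reproduces the accuracy reported for the TOW model is not something the sketch can establish.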


Bibliographic details

Source: International Conference on Unconventional Computation, 2010, 12 pages
Authors: Song-Ju Kim; Masashi Aono; Masahiko Hara
Original format: PDF
Chinese Library Classification: TP3-53
Keywords: Multi-armed bandit problem; Reinforcement learning; Bio-inspired computation; Amoeba-based computing


