Journal of Software

Accelerated ε-Greedy Multi Armed Bandit Algorithm for Online Sequential-Selections Applications


Abstract

Current algorithms for the multi-armed bandit (MAB) problem often perform well under stationary observations. Although their performance may be acceptable with accurate parameter settings, most of them degrade under non-stationary observations. We set up an incremental ε-greedy model that uses a stochastic mean equation as its action-value function, which makes it more applicable to real-world problems. In contrast to iterative algorithms that suffer from step-size dependency, we propose an adaptive step-size model (ASM) to obtain an adaptive MAB algorithm. The proposed model employs the ε-greedy approach as its action-selection policy. In addition, a dynamic exploration parameter ε is introduced that becomes ineffective as the decision maker's intelligence increases. The proposed model is empirically evaluated and compared with existing algorithms, including the standard ε-greedy, Softmax, ε-decreasing, and UCB-Tuned models, under both stationary and non-stationary conditions. ASM not only addresses the parameter-dependency problem but also performs comparably to or better than the mentioned algorithms. Applying these enhancements to the standard ε-greedy algorithm reduces the learning time, which makes it more attractive for a wide range of online sequential-selection applications such as autonomous agents, adaptive control, industrial robots, and trend forecasting in management and economics.
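
As a rough illustration of the incremental ε-greedy update the abstract builds on, the Python sketch below implements the textbook rule Q ← Q + α(r − Q). It is not the paper's ASM: the abstract does not state the adaptive step-size rule or the dynamic ε schedule, so the sketch assumes either a sample-average step (α = 1/n) or a fixed constant α, and a fixed ε. The class name `EpsilonGreedyBandit` and all of its parameters are hypothetical.

```python
import random


class EpsilonGreedyBandit:
    """Textbook incremental epsilon-greedy learner (an assumption, not ASM)."""

    def __init__(self, n_arms, epsilon=0.1, step_size=None, seed=None):
        self.epsilon = epsilon          # fixed exploration probability
        self.step_size = step_size      # None means sample average (1/n)
        self.values = [0.0] * n_arms    # action-value estimates Q(a)
        self.counts = [0] * n_arms      # number of pulls per arm
        self.rng = random.Random(seed)

    def select_arm(self):
        # Explore with probability epsilon, otherwise exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        best = max(self.values)
        # Break ties between equally good arms at random.
        candidates = [i for i, v in enumerate(self.values) if v == best]
        return self.rng.choice(candidates)

    def update(self, arm, reward):
        # Incremental update: Q <- Q + alpha * (reward - Q).
        self.counts[arm] += 1
        if self.step_size is None:
            alpha = 1.0 / self.counts[arm]   # stationary case: running mean
        else:
            alpha = self.step_size           # constant step: tracks drift
        self.values[arm] += alpha * (reward - self.values[arm])
```

With `step_size=None` the estimate is the running mean of observed rewards, which suits stationary rewards. A constant `step_size` weights recent rewards more heavily and so follows a drifting mean, at the cost of having to tune that constant, which is the kind of step-size dependency the abstract says ASM is meant to remove.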
