Neurocomputing > Softmax exploration strategies for multiobjective reinforcement learning

Softmax exploration strategies for multiobjective reinforcement learning


Abstract

Despite growing interest over recent years in applying reinforcement learning to multiobjective problems, there has been little research into the applicability and effectiveness of exploration strategies within the multiobjective context. This work considers several widely-used approaches to exploration from the single-objective reinforcement learning literature, and examines their incorporation into multiobjective Q-learning. In particular this paper proposes two novel approaches which extend the softmax operator to work with vector-valued rewards. The performance of these exploration strategies is evaluated across a set of benchmark environments. Issues arising from the multiobjective formulation of these benchmarks which impact the performance of the exploration strategies are identified. It is shown that of the techniques considered, the combination of the novel softmax-epsilon exploration with optimistic initialisation provides the most effective trade-off between exploration and exploitation. (C) 2017 Elsevier B.V. All rights reserved.
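The abstract describes combining softmax-epsilon exploration with optimistic initialisation for vector-valued Q-estimates. The paper's actual softmax extensions are not reproduced here; the following is a minimal sketch of the general idea under an assumed linear scalarisation of the objectives. The function name, the `weights` parameter, and the scalarisation step are illustrative assumptions, not the paper's method.

```python
import numpy as np

def softmax_epsilon_action(q_vectors, weights, tau=1.0, epsilon=0.1, rng=None):
    """Select an action from vector-valued Q-estimates (illustrative sketch).

    q_vectors: (n_actions, n_objectives) array of Q-value vectors.
    weights:   per-objective weights for a linear scalarisation -- an
               assumption made here; the paper extends softmax to the
               vector rewards themselves.
    """
    rng = rng or np.random.default_rng()
    scalar_q = q_vectors @ weights            # collapse each Q-vector to a scalar
    if rng.random() < epsilon:                # epsilon branch: uniform exploration
        return int(rng.integers(len(scalar_q)))
    # softmax branch: numerically stable Boltzmann preferences
    prefs = np.exp((scalar_q - scalar_q.max()) / tau)
    return int(rng.choice(len(scalar_q), p=prefs / prefs.sum()))

# Optimistic initialisation: start every Q-vector above any achievable return,
# so unvisited actions keep being tried until their estimates fall.
q = np.full((4, 2), 10.0)                     # 4 actions, 2 objectives
a = softmax_epsilon_action(q, weights=np.array([0.5, 0.5]))
```

With all estimates initialised to the same optimistic value, the softmax branch selects among actions uniformly; as estimates are updated, selection probability concentrates on actions whose (scalarised) values remain high.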
