International Journal of Intelligent Information Systems

Multiagent Cooperative Reinforcement Learning by Expert Agents (MCRLEA)

Abstract

This paper presents a novel approach, Multiagent Cooperative Reinforcement Learning by Expert Agents (MCRLEA), for dynamic decision making in retail applications. It also proposes several cooperation schemes for multiagent cooperative reinforcement learning: EQ learning, EGroup, EDynamic, EGoal driven, and the Expert agents scheme. The implementation results demonstrate that the proposed cooperation schemes can speed up a group of agents in reaching good action policies. The approach models three retail stores in a retail marketplace. The retailers can help one another and profit from shared knowledge while learning their own strategies, which reflect their individual aims and benefits. The vendors are the knowledgeable agents, and they use cooperative learning to train in this environment. Under key assumptions about each vendor's stock policy, restocking period, and customer arrival process, the problem is formulated as a Markov decision process, which makes it possible to design learning algorithms. The proposed algorithms noticeably learn dynamic customer behavior. The paper first reports results of cooperative reinforcement learning algorithms for three shop agents over a one-year sales period, and then results of the proposed approach for three shop agents over the same period. The results obtained with the proposed expert-agent-based cooperation approach show that such methods can lead to quick convergence of agents in a dynamic environment.
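The abstract does not give the update rules or the details of the cooperation schemes, so the following is only a minimal sketch of the general idea: several tabular Q-learning agents on a toy retail MDP that periodically blend their Q-tables toward the currently best-scoring ("expert") agent. The stock dynamics, prices, rewards, sharing weight, and all function names (`step`, `q_update`, `share_from_expert`, `simulate`) are assumptions made for illustration and are not taken from the paper.

```python
import random

MAX_STOCK = 4          # hypothetical shelf capacity
N_ACTIONS = 2          # 0 = low price, 1 = high price (assumed action set)


def step(stock, action, rng):
    """Toy dynamics (assumption): a low price sells more often, a high price earns more."""
    sell_prob, margin = ((0.8, 1.0), (0.4, 3.0))[action]
    sold = rng.random() < sell_prob
    reward = margin if sold else 0.0
    next_stock = stock - 1 if sold else stock
    if next_stock == 0:
        next_stock = MAX_STOCK  # instant restock, purely for simplicity
    return next_stock, reward


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step tabular Q-learning update on a list-of-lists Q-table."""
    best_next = max(q[next_state])
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])


def share_from_expert(q_tables, scores, weight=0.5):
    """Blend every non-expert table toward the table of the best-scoring agent."""
    expert = max(range(len(q_tables)), key=lambda i: scores[i])
    for i, q in enumerate(q_tables):
        if i == expert:
            continue
        for s in range(len(q)):
            for a in range(len(q[s])):
                q[s][a] = (1 - weight) * q[s][a] + weight * q_tables[expert][s][a]
    return expert


def simulate(n_agents=3, steps=2000, share_every=100, epsilon=0.1, seed=0):
    """Run epsilon-greedy agents side by side with periodic expert sharing."""
    rng = random.Random(seed)
    q_tables = [[[0.0] * N_ACTIONS for _ in range(MAX_STOCK + 1)]
                for _ in range(n_agents)]
    stocks = [MAX_STOCK] * n_agents
    scores = [0.0] * n_agents
    for t in range(1, steps + 1):
        for i in range(n_agents):
            s = stocks[i]
            if rng.random() < epsilon:
                a = rng.randrange(N_ACTIONS)          # explore
            else:
                a = max(range(N_ACTIONS), key=lambda k: q_tables[i][s][k])
            s2, r = step(s, a, rng)
            q_update(q_tables[i], s, a, r, s2)
            stocks[i] = s2
            scores[i] += r
        if t % share_every == 0:
            share_from_expert(q_tables, scores)
            scores = [0.0] * n_agents                 # score only the latest window
    return q_tables
```

Calling `simulate()` returns one Q-table per shop agent. Setting `share_every` larger than `steps` disables sharing, which gives a simple independent-learner baseline to compare against.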