Conference: World Multiconference on Systemics, Cybernetics and Informatics; SCI 2000

A study on the Convergence of different Learning Strategies in a Multi-teacher Environment

Abstract

The learning strategy in a multi-teacher environment is not unique. Such an environment provides an opportunity to learn different strategies: by processing the environmental response in different ways, different strategies can be learned. The three major learning strategies are to find the action with the least weighted average penalty (LWAP), to find the action that has the maximum probability of collectively getting the majority approval (MPMA), and to find the action that the maximum number of teachers independently agree is the best (MNTA). All of the learning algorithms except MNTA are known to be absolutely expedient and ε-optimal in a stationary environment. In this paper, the rate of convergence and the accuracy of convergence of the three learning strategies are studied. For simulation, a seven-action set in a five-teacher environment is considered.
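To make the three criteria concrete, the following is a minimal sketch of the action each strategy would ideally single out, not of the learning algorithms studied in the paper. It assumes a known penalty-probability matrix `c`, where `c[i][j]` is the probability that teacher `j` penalizes action `i`, that teachers respond independently, and that LWAP weights are uniform unless given. All function names and the example matrix are hypothetical and do not come from the paper.

```python
from collections import Counter
from itertools import product


def lwap_action(c, weights=None):
    """Action with the least weighted average penalty (uniform weights assumed)."""
    n_teachers = len(c[0])
    if weights is None:
        weights = [1.0 / n_teachers] * n_teachers
    avg = [sum(w * p for w, p in zip(weights, row)) for row in c]
    return min(range(len(c)), key=avg.__getitem__)


def majority_approval_prob(row):
    """Probability that more than half of the teachers approve an action.

    Assumes independent teachers; enumerates all approve/penalize outcomes,
    which is fine for a handful of teachers.
    """
    n = len(row)
    total = 0.0
    for outcome in product((0, 1), repeat=n):  # 1 = approve, 0 = penalize
        if 2 * sum(outcome) > n:
            p = 1.0
            for approved, penalty in zip(outcome, row):
                p *= (1.0 - penalty) if approved else penalty
            total += p
    return total


def mpma_action(c):
    """Action with the maximum probability of getting majority approval."""
    probs = [majority_approval_prob(row) for row in c]
    return max(range(len(c)), key=probs.__getitem__)


def mnta_action(c):
    """Action that the largest number of teachers individually rank best."""
    n_teachers = len(c[0])
    votes = Counter(
        min(range(len(c)), key=lambda i: c[i][j]) for j in range(n_teachers)
    )
    return votes.most_common(1)[0][0]


# Hypothetical 3-action, 3-teacher environment (rows = actions, columns = teachers).
c = [
    [0.05, 0.05, 0.95],
    [0.30, 0.30, 0.30],
    [0.90, 0.90, 0.10],
]
print(lwap_action(c), mpma_action(c), mnta_action(c))
```

In this made-up environment LWAP selects action 1, while MPMA and MNTA both select action 0, which illustrates the abstract's point that the strategy to be learned depends on how the teachers' responses are combined.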
