Conference proceedings: Machine learning (ML95)

Efficient Learning from Delayed Rewards through Symbiotic Evolution

Abstract

This paper presents a new reinforcement learning method called SANE (Symbiotic, Adaptive Neuro-Evolution) that evolves a population of neurons through genetic algorithms to form a neural network for a given task. Symbiotic evolution promotes both cooperation and specialization in the population, which results in a fast, efficient genetic search and discourages convergence to suboptimal solutions. In the inverted pendulum problem, SANE formed effective networks 9 to 16 times faster in CPU time than the Adaptive Heuristic Critic and 2 times faster than the GENITOR neuro-evolution approach without loss of generalization. Such efficient learning, combined with few domain assumptions, makes SANE a promising approach to a broad range of reinforcement learning problems, including many real-world applications.
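The idea of evolving individual neurons, rather than whole networks, can be illustrated with a small sketch. It is not the authors' implementation: the abstract gives no algorithmic details, so the dense weight encoding, the averaging of network scores into neuron fitness, the top-quarter selection, the mutation scheme, every constant, and the toy XOR task (used here in place of the inverted pendulum) are all assumptions made for illustration.

```python
import math
import random

# Every constant below is an illustrative assumption, not a value from the paper.
N_INPUTS = 2      # inputs of the toy task
N_OUTPUTS = 1     # outputs of the toy task
POP_SIZE = 40     # neurons in the population
NET_SIZE = 4      # hidden neurons sampled to form one network
TRIALS = 100      # networks formed and evaluated per generation
MUT_RATE = 0.1    # per-weight mutation probability
MUT_STD = 0.3     # standard deviation of Gaussian mutation

# Toy stand-in task (assumption): fit XOR instead of balancing a pole.
XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0),
       ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def new_neuron(rng):
    """A neuron genome: input weights, one bias, then output weights."""
    return [rng.uniform(-1.0, 1.0) for _ in range(N_INPUTS + 1 + N_OUTPUTS)]


def forward(neurons, x):
    """Run a one-hidden-layer network built from the given neurons."""
    out = [0.0] * N_OUTPUTS
    for g in neurons:
        pre = g[N_INPUTS] + sum(w * xi for w, xi in zip(g[:N_INPUTS], x))
        h = math.tanh(pre)
        for k in range(N_OUTPUTS):
            out[k] += g[N_INPUTS + 1 + k] * h
    return out


def network_fitness(neurons):
    """Negative squared error on the toy task (higher is better)."""
    return -sum((forward(neurons, x)[0] - y) ** 2 for x, y in XOR)


def evolve(generations=30, seed=0):
    rng = random.Random(seed)
    pop = [new_neuron(rng) for _ in range(POP_SIZE)]
    best = -math.inf
    history = []  # best network fitness seen so far, per generation
    for _ in range(generations):
        total = [0.0] * POP_SIZE
        count = [0] * POP_SIZE
        # Form random networks; each one's score is credited to its members.
        for _ in range(TRIALS):
            idx = rng.sample(range(POP_SIZE), NET_SIZE)
            fit = network_fitness([pop[i] for i in idx])
            best = max(best, fit)
            for i in idx:
                total[i] += fit
                count[i] += 1
        # Neuron fitness = mean score of the networks it took part in.
        score = [total[i] / count[i] if count[i] else -math.inf
                 for i in range(POP_SIZE)]
        ranked = sorted(range(POP_SIZE), key=lambda i: score[i], reverse=True)
        elite = [pop[i] for i in ranked[:POP_SIZE // 4]]
        # Refill the population by one-point crossover plus mutation.
        children = []
        while len(elite) + len(children) < POP_SIZE:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, len(a))
            child = a[:cut] + b[cut:]
            child = [w + rng.gauss(0.0, MUT_STD) if rng.random() < MUT_RATE else w
                     for w in child]
            children.append(child)
        pop = elite + children
        history.append(best)
    return pop, history
```

Calling `evolve()` returns the final neuron population and the running best network score. Because a neuron is only ever scored through the networks it joins, it is rewarded for combining well with others, which is one way to read the cooperation and specialization described in the abstract.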