Efficient Learning from Delayed Rewards through Symbiotic Evolution

Abstract

This paper presents a new reinforcement learning method called SANE (Symbiotic, Adaptive Neuro-Evolution) that evolves a population of neurons through genetic algorithms to form a neural network for a given task. Symbiotic evolution promotes both cooperation and specialization in the population, which results in a fast, efficient genetic search and discourages convergence to suboptimal solutions. In the inverted pendulum problem, SANE formed effective networks 9 to 16 times faster in CPU time than the Adaptive Heuristic Critic and 2 times faster than the GENITOR neuro-evolution approach without loss of generalization. Such efficient learning, combined with few domain assumptions, makes SANE a promising approach to a broad range of reinforcement learning problems, including many real-world applications.
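The idea of evolving individual neurons rather than whole networks can be illustrated with a minimal sketch. The abstract states only that a population of neurons is evolved with genetic algorithms to form a network; everything else below is an assumption made for illustration and is not taken from the paper: the three-gene neuron encoding, the toy curve-fitting task standing in for the inverted pendulum, the rule that a neuron's fitness is the average score of the networks it joined, the elite fraction, and the crossover and mutation settings. All function and constant names are hypothetical.

```python
import math
import random

# Hypothetical encoding: each neuron is [input_weight, bias, output_weight].
GENES_PER_NEURON = 3
SAMPLE_INPUTS = [-2.0, -1.0, 0.0, 1.0, 2.0]


def toy_task_fitness(neurons):
    """Toy stand-in for a real task: negative mean squared error fitting sin(x)."""
    error = 0.0
    for x in SAMPLE_INPUTS:
        output = sum(w_out * math.tanh(w_in * x + b) for w_in, b, w_out in neurons)
        error += (output - math.sin(x)) ** 2
    return -error / len(SAMPLE_INPUTS)


def evaluate_neurons(population, network_size, num_trials, task_fitness, rng):
    """Score each neuron by the average fitness of the networks it took part in."""
    totals = [0.0] * len(population)
    counts = [0] * len(population)
    for _ in range(num_trials):
        # Assumption: a network is a random subset of the neuron population.
        members = rng.sample(range(len(population)), network_size)
        score = task_fitness([population[i] for i in members])
        for i in members:
            totals[i] += score
            counts[i] += 1
    # Neurons that never joined a network get the worst possible score.
    return [t / c if c else float("-inf") for t, c in zip(totals, counts)]


def breed(population, fitness, rng, elite_fraction=0.25, mutation_scale=0.1):
    """Keep the best neurons and refill the population with mutated offspring."""
    ranked = [n for _, n in sorted(zip(fitness, population),
                                   key=lambda pair: pair[0], reverse=True)]
    elite = ranked[:max(2, int(len(ranked) * elite_fraction))]
    children = []
    while len(elite) + len(children) < len(population):
        parent_a, parent_b = rng.sample(elite, 2)
        cut = rng.randrange(1, len(parent_a))  # one-point crossover
        child = parent_a[:cut] + parent_b[cut:]
        children.append([g + rng.gauss(0.0, mutation_scale) for g in child])
    return elite + children


def evolve(generations, seed, population_size=40, network_size=4, num_trials=200):
    """Run the sketch for a fixed number of generations; returns the neurons."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1.0, 1.0) for _ in range(GENES_PER_NEURON)]
                  for _ in range(population_size)]
    for _ in range(generations):
        fitness = evaluate_neurons(population, network_size, num_trials,
                                   toy_task_fitness, rng)
        population = breed(population, fitness, rng)
    return population
```

Calling `evolve(generations=20, seed=0)` returns the final neuron population. Because a neuron is scored only through the networks it joins, a neuron does well when it combines well with others, which is one plausible reading of the cooperation and specialization the abstract describes.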

Bibliographic details

Source
Machine learning (ML95) | 1995 | pp. 396-404 | 9 pages
Conference location: Tahoe City, CA (US)
Authors
David E. Moriarty; Risto Miikkulainen
Original format: PDF
Language: English
Chinese Library Classification: Automation technology, computer technology
