Proceedings of the 3rd European Conference on Mobile Robots

Improving reinforcement learning through a better exploration strategy and an adjustable representation of the environment



Abstract

Reinforcement learning is a promising strategy because all the robot needs in order to start a random search for the desired solution is a reinforcement function specifying the main restrictions on the behaviour. Nevertheless, the robot wastes too much time executing random, mostly wrong, actions, and the user is forced to determine the balance between exploring new actions and exploiting already tried ones. In this context we propose a methodology that achieves fast convergence towards good robot-control policies and determines on its own the required degree of exploration at every instant. The performance of our approach is due to the mutual, dynamic influence that three elements exert on each other: reinforcement learning, genetic algorithms, and a dynamic representation of the environment around the robot. In this paper we describe the application of our approach to two common tasks in mobile robotics: wall following and door traversal. The experimental results show that the required learning time is significantly reduced and the stability of the learning process is increased. Furthermore, the low user intervention required to solve both tasks (only the reinforcement function is changed) confirms the contribution of this approach towards robot-learning techniques that are fast, user friendly, and demand little application-specific knowledge from the user, something increasingly required nowadays.
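The exploration/exploitation balance the abstract refers to can be illustrated with a minimal, hypothetical sketch (not the paper's method, which adapts the exploration degree automatically): tabular Q-learning on a toy one-dimensional corridor, where a hand-tuned exploration rate decays over the episodes.

```python
import random

def q_learning(n_states=6, episodes=200, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a 1-D corridor: start at state 0, goal at the
    right end; actions are 0 = left, 1 = right; reward 1 only at the goal."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for ep in range(episodes):
        eps = max(0.05, 0.98 ** ep)  # exploration rate decays over time
        s = 0
        while s != n_states - 1:
            if rng.random() < eps:
                a = rng.randrange(2)               # explore: random action
            else:
                a = 0 if q[s][0] > q[s][1] else 1  # exploit: greedy action
            s2 = max(0, s - 1) if a == 0 else s + 1
            r = 1.0 if s2 == n_states - 1 else 0.0
            q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
            s = s2
    return q

q = q_learning()
policy = [0 if row[0] > row[1] else 1 for row in q[:-1]]  # greedy policy
```

The decay schedule in this sketch is the user-tuned knob the paper argues against: too much early exploitation and the robot never finds the goal, too much late exploration and it keeps wasting steps on wrong actions.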


