Journal: Artificial Life and Robotics

EM-based policy hyper parameter exploration: application to standing and balancing of a two-wheeled smartphone robot



Abstract

This paper proposes a novel policy search algorithm called EM-based Policy Hyper Parameter Exploration (EPHE), which integrates two reinforcement learning algorithms: Policy Gradients with Parameter-based Exploration (PGPE) and EM-based Reward-Weighted Regression. Like PGPE, EPHE evaluates a deterministic policy in each episode, with the policy parameters sampled from a prior distribution defined by the policy hyper parameters (mean and variance). Following EM-based Reward-Weighted Regression, the policy hyper parameters are updated by reward-weighted averaging, so neither gradient calculation nor learning-rate tuning is required. The proposed method is tested on the pendulum swing-up and cart-pole balancing benchmarks and on a simulation of standing and balancing of a two-wheeled smartphone robot. Experimental results show that EPHE achieves efficient learning without learning-rate tuning, even for a task with discontinuities.
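The update scheme the abstract describes can be illustrated compactly: sample policy parameters from a Gaussian prior, run each deterministic policy for one episode, then refit the prior's mean and variance by reward-weighted averaging. The sketch below is an assumption-laden illustration of that idea, not the authors' implementation; the toy `episode_return` function, the sample counts `N` and `K`, and the top-K weighting are all placeholders standing in for real episode rollouts.

```python
import numpy as np

rng = np.random.default_rng(0)

def episode_return(theta):
    # Toy stand-in for a rollout of the deterministic policy theta:
    # reward peaks at theta = [1, -2] (a real task would run the robot/simulator).
    target = np.array([1.0, -2.0])
    return float(np.exp(-np.sum((theta - target) ** 2)))

# Policy hyper parameters: per-dimension mean and std of the parameter prior.
mu = np.zeros(2)
sigma = np.full(2, 2.0)

N, K = 20, 10  # episodes per iteration, best episodes kept for the update
for _ in range(100):
    thetas = rng.normal(mu, sigma, size=(N, 2))       # sample deterministic policies
    returns = np.array([episode_return(t) for t in thetas])
    best = np.argsort(returns)[-K:]                   # keep the K highest-return episodes
    w = returns[best] / returns[best].sum()           # normalized reward weights
    # EM-style reward-weighted averaging: no gradient, no learning rate.
    mu = w @ thetas[best]
    sigma = np.sqrt(w @ (thetas[best] - mu) ** 2)
```

After the loop, `mu` concentrates near the high-reward region and `sigma` shrinks, mirroring how the hyper parameters replace an explicit learning rate.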
