Learning Robot Arm Controls Using Augmented Random Search in Simulated Environments

Abstract

We investigate learning continuous-action policies for controlling a six-axis robot arm. Traditional tabular Q-learning handles discrete actions well but copes poorly with continuous actions, since the tabular approach is constrained by the size of the state-value table. Recent advances in deep reinforcement learning and policy gradient learning replace the look-up table with function approximators such as artificial neural networks. The resulting policy networks can predict continuous as well as discrete actions. However, deep reinforcement learning and policy gradient learning have been criticized for their complexity. Recent work reports that Augmented Random Search (ARS) has better sample efficiency and simpler hyper-parameter tuning, which motivates us to apply the technique to our robot-arm reaching tasks. We constructed a custom simulated robot-arm environment using the Unity game engine with its Machine Learning Agents toolkit, then designed three robot-arm reaching tasks. Twelve models were trained using ARS. Another four models were trained using a state-of-the-art policy gradient (PG) technique, proximal policy optimization (PPO), to provide a policy-gradient baseline. Empirical results of the models trained using ARS and PPO are analyzed and discussed.
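
For readers unfamiliar with ARS, the following is a minimal sketch of its basic update loop, not the authors' implementation. It assumes a linear policy and a toy 2D point-reaching task standing in for the Unity robot-arm environment. All names (`rollout`, `ars_train`, `evaluate`) and hyper-parameter values are hypothetical, the state normalization used in some ARS variants is omitted, and evaluating each pair of perturbed policies on the same random target is a simplification made here.

```python
import numpy as np


def rollout(M, episode_seed, horizon=20):
    """Return the total reward of linear policy M on a toy reaching task.

    Toy stand-in environment (an assumption, not the paper's Unity setup):
    a point starts at the origin and must reach a random 2D target.
    """
    rng = np.random.default_rng(episode_seed)
    pos = np.zeros(2)
    target = rng.uniform(-1.0, 1.0, size=2)
    total = 0.0
    for _ in range(horizon):
        state = target - pos                      # observation: offset to target
        action = np.clip(M @ state, -1.0, 1.0)    # continuous action (velocity)
        pos = pos + 0.1 * action
        total -= np.linalg.norm(target - pos)     # reward: negative distance
    return total


def ars_train(n_iters=100, n_dirs=8, top_b=4, step_size=0.02, noise=0.03, seed=0):
    """Basic ARS loop: perturb, evaluate both signs, keep best directions, update."""
    rng = np.random.default_rng(seed)
    M = np.zeros((2, 2))                          # linear policy: action = M @ state
    for _ in range(n_iters):
        deltas = rng.standard_normal((n_dirs, 2, 2))
        ep_seed = int(rng.integers(1 << 30))      # same target for every rollout this round
        r_plus = np.array([rollout(M + noise * d, ep_seed) for d in deltas])
        r_minus = np.array([rollout(M - noise * d, ep_seed) for d in deltas])
        # Keep the top_b directions ranked by their better-performing sign.
        best = np.argsort(-np.maximum(r_plus, r_minus))[:top_b]
        # Scale the step by the standard deviation of the rewards that were used.
        sigma_r = np.concatenate([r_plus[best], r_minus[best]]).std() + 1e-8
        # Reward-difference-weighted sum of the kept directions, shape (2, 2).
        step = np.tensordot(r_plus[best] - r_minus[best], deltas[best], axes=1)
        M += step_size / (top_b * sigma_r) * step
    return M


def evaluate(M, n_episodes=20):
    """Mean return over a fixed set of held-out episode seeds."""
    return float(np.mean([rollout(M, 1000 + i) for i in range(n_episodes)]))


if __name__ == "__main__":
    trained = ars_train()
    print("untrained return:", evaluate(np.zeros((2, 2))))
    print("trained return:  ", evaluate(trained))
```

The only tunable quantities in this sketch are the step size, the exploration noise, the number of directions, and how many of them are kept, which illustrates the sense in which ARS tuning is often described as simple; it says nothing about how ARS and PPO compare on the paper's tasks.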

Bibliographic record

Source
Multi-disciplinary International Conference on Artificial Intelligence, 2021, pp. 118-128 (11 pages)
Authors
Somnuk Phon-Amnuaisuk; Peter David Shannon; Saiful Omar
Original format
PDF
Keywords
Augmented Random Search; Robot arm controls; Reinforcement learning

