IEEE Robotics and Automation Letters
Learning to Walk a Tripod Mobile Robot Using Nonlinear Soft Vibration Actuators With Entropy Adaptive Reinforcement Learning


Abstract

Soft mobile robots have shown great potential in unstructured and confined environments by taking advantage of their excellent adaptability and high dexterity. However, several issues remain to be addressed in soft robots, such as actuation speed and controllability. In this letter, a new vibration actuator is proposed that exploits the nonlinear stiffness characteristic of a hyperelastic material to create continuous vibration of the actuator. By integrating three of the proposed actuators, we also present an advanced soft mobile robot with a high degree of freedom of movement. However, since the dynamic model of a soft mobile robot is generally hard to obtain (intractable), it is difficult to design a controller for the robot. In this regard, we present a method to train a controller using a novel reinforcement learning (RL) algorithm called adaptive soft actor-critic (ASAC). ASAC gradually reduces a parameter called the entropy temperature, which regulates the entropy of the control policy. In this way, the proposed method narrows the search space during training and reduces the duration of the demanding data-collection process in real-world experiments. To verify the robustness and controllability of our robot and the RL algorithm, experiments on zig-zag path tracking and obstacle avoidance were conducted, and the robot successfully completed the missions with only an hour of training time.
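The core idea described above, a soft actor-critic variant whose entropy temperature is gradually reduced during training, can be sketched as follows. The abstract does not specify ASAC's actual decay schedule or update rule, so the linear schedule and the simplified per-sample policy objective below are illustrative assumptions, not the authors' implementation.

```python
def annealed_temperature(step, total_steps, alpha_init=1.0, alpha_final=0.01):
    """Anneal the entropy temperature alpha from alpha_init down to alpha_final.

    A linear schedule is assumed here for illustration; ASAC's concrete
    adaptation rule is not given in the abstract.
    """
    frac = min(step / total_steps, 1.0)
    return alpha_init + frac * (alpha_final - alpha_init)


def soft_policy_objective(q_value, log_prob, alpha):
    """SAC-style per-sample policy objective: expected return plus an
    alpha-weighted entropy bonus (entropy approximated by -log_prob).

    Early in training (large alpha) the entropy term dominates and encourages
    exploration; as alpha decays, the objective concentrates on exploiting
    high-Q actions, narrowing the policy search space.
    """
    return q_value - alpha * log_prob
```

As `alpha` shrinks over the course of training, the entropy bonus contributes less to the objective, which is the mechanism the letter credits for shortening real-world data collection.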
