Venue: IEEE Pacific Visualization Symposium

DynamicsExplorer: Visual Analytics for Robot Control Tasks involving Dynamics and LSTM-based Control Policies

Abstract

Deep reinforcement learning (RL), in which a policy represented by a deep neural network is trained, has shown some success in playing video games and chess. However, applying RL to real-world tasks such as robot control remains challenging. Generating the massive number of samples needed to train control policies with RL on real robots is very expensive and hence impractical, so it is common to train in simulation and then transfer the policy to real environments. The trained policy, however, may fail in the real world because of differences between the training and real environments, especially differences in dynamics. To diagnose such problems, it is crucial for experts to understand (1) how the trained policy behaves under different dynamics settings, (2) which part of the policy affects the behaviors the most when the dynamics setting changes, and (3) how to adjust the training procedure to make the policy robust.

This paper presents DynamicsExplorer, a visual analytics tool for diagnosing trained policies on robot control tasks under different dynamics settings. DynamicsExplorer allows experts to overview the results of multiple tests with different dynamics-related parameter settings, so that they can visually detect failures and analyze the sensitivity of different parameters. Experts can further examine the internal activations of the policy for selected tests and compare the activations between successful and failed tests. Such comparisons help experts form hypotheses about the policy and allow them to verify those hypotheses via DynamicsExplorer. Multiple use cases are presented to demonstrate the utility of DynamicsExplorer.
