
LEARNING TO SCHEDULE CONTROL FRAGMENTS FOR PHYSICS-BASED CHARACTER SIMULATION AND ROBOTS USING DEEP Q-LEARNING



Abstract

The disclosure provides an approach for learning to schedule control fragments for physics-based virtual character simulations and physical robot control. Given precomputed tracking controllers, a simulation application segments the controllers into control fragments and learns a scheduler that selects control fragments at runtime to accomplish a task. In one embodiment, each scheduler may be modeled with a Q-network that maps a high-level representation of the state of the simulation to a control fragment for execution. In such a case, the deep Q-learning algorithm applied to learn the Q-network schedulers may be adapted to use a reward function that prefers the original controller sequence and an exploration strategy that gives more chance to in-sequence control fragments than to out-of-sequence control fragments. Such a modified Q-learning algorithm learns schedulers that are capable of following the original controller sequence most of the time while selecting out-of-sequence control fragments when necessary.
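
The two adaptations named in the abstract, an exploration strategy biased toward in-sequence control fragments and a reward that prefers the original controller sequence, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and does not come from the patent: the function names, the cyclic ordering of fragments, the probabilities `eps_random` and `eps_sequence`, and the fixed `out_of_sequence_penalty`. The Q-network itself is replaced by a plain list of Q-values.

```python
import random


def in_sequence_fragment(current: int, num_fragments: int) -> int:
    """Fragment that follows `current` in the original controller.

    Assumes (hypothetically) that the fragments form a cycle.
    """
    return (current + 1) % num_fragments


def select_fragment(q_values, current, eps_random=0.1, eps_sequence=0.2, rng=random):
    """Choose the next control fragment with sequence-biased exploration.

    With probability eps_random, pick a uniformly random fragment.
    With probability eps_sequence, pick the in-sequence fragment.
    Otherwise, pick the fragment with the highest Q-value.
    """
    n = len(q_values)
    u = rng.random()
    if u < eps_random:
        return rng.randrange(n)
    if u < eps_random + eps_sequence:
        return in_sequence_fragment(current, n)
    return max(range(n), key=lambda a: q_values[a])


def shaped_reward(task_reward, current, chosen, num_fragments,
                  out_of_sequence_penalty=0.5):
    """Task reward, reduced by an assumed penalty when the chosen
    fragment departs from the original controller sequence."""
    if chosen == in_sequence_fragment(current, num_fragments):
        return task_reward
    return task_reward - out_of_sequence_penalty
```

Under these assumptions, setting `eps_sequence` above zero makes the in-sequence fragment more likely to be tried than any single out-of-sequence fragment during exploration, while the penalty nudges the learned values toward following the original order unless deviating pays off.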


Bibliographic details

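Publication number: US2018089553A1
Patent type:
Publication date: 2018-03-29
Original format: PDF
Applicant/assignee: DISNEY ENTERPRISES INC.
Application number: US201615277872
Inventors: JESSICA KATE HODGINS; LIBIN LIU
Filing date: 2016-09-27
Classification: G06N3/00; G06N3/08; G06N3/04
Country: US
Date added to database: 2022-08-21 13:01:27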
