Conference: Annual Conference on Neural Information Processing Systems

Generating Long-term Trajectories Using Deep Hierarchical Networks


Abstract

We study the problem of modeling spatiotemporal trajectories over long time horizons using expert demonstrations. For instance, in sports, agents often choose action sequences with long-term goals in mind, such as achieving a certain strategic position. Conventional policy learning approaches, such as those based on Markov decision processes, generally fail at learning cohesive long-term behavior in such high-dimensional state spaces, and are only effective when fairly myopic decision-making yields the desired behavior. The key difficulty is that conventional models are "single-scale" and only learn a single state-action policy. We instead propose a hierarchical policy class that automatically reasons about both long-term and short-term goals, which we instantiate as a hierarchical neural network. We showcase our approach in a case study on learning to imitate demonstrated basketball trajectories, and show that it generates significantly more realistic trajectories than non-hierarchical baselines, as judged by professional sports analysts.
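
As a rough illustration of the "single-scale" versus hierarchical distinction, the sketch below splits a policy into a long-term level that chooses a goal position and a short-term level that chooses the next move toward it. It is a hand-coded toy under stated assumptions, not the neural network the abstract refers to: the grid world, the names `macro_policy`, `micro_policy`, and `rollout`, and the candidate goal positions are all hypothetical.

```python
import random

# Hypothetical toy world: an agent on a 2-D grid moves at most one cell per step.
ACTIONS = {
    "left": (-1, 0),
    "right": (1, 0),
    "up": (0, 1),
    "down": (0, -1),
    "stay": (0, 0),
}

# Assumed "strategic positions"; a learned model would infer goals from data.
CANDIDATE_GOALS = [(0, 0), (9, 0), (5, 5)]


def macro_policy(state, rng):
    """Long-term level: choose a goal position (here, simply at random)."""
    return rng.choice(CANDIDATE_GOALS)


def micro_policy(state, goal):
    """Short-term level: choose one action that closes the larger axis gap."""
    dx, dy = goal[0] - state[0], goal[1] - state[1]
    if dx == 0 and dy == 0:
        return "stay"
    if abs(dx) >= abs(dy):
        return "right" if dx > 0 else "left"
    return "up" if dy > 0 else "down"


def rollout(start, horizon, seed=0):
    """Generate a trajectory; the goal is re-chosen only once it is reached."""
    rng = random.Random(seed)
    state, goal = start, None
    trajectory = [start]
    for _ in range(horizon):
        if goal is None or state == goal:
            goal = macro_policy(state, rng)
        step = ACTIONS[micro_policy(state, goal)]
        state = (state[0] + step[0], state[1] + step[1])
        trajectory.append(state)
    return trajectory
```

Calling `rollout((2, 3), 20)` returns the start position followed by 20 further positions. A single-scale policy would correspond to `micro_policy` alone, with no persistent goal carried across steps.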
