Published in: International Conference on Neural Information Processing

TAC-GAIL: A Multi-modal Imitation Learning Method

Abstract

Imitation learning provides a family of promising frameworks that learn policies directly from expert demonstrations. However, most imitation learning methods assume that the demonstrations come from a single expert and have a single modality. In practice, the demonstrations may be generated by different experts in different modalities. Auxiliary classifier generative adversarial imitation learning (AC-GAIL) uses an auxiliary classifier to classify samples according to their modalities, so that the generator can perform different actions for different modalities and thereby obtain a multi-modal policy. However, we find that AC-GAIL's objective function is missing a conditional entropy term, and that this conditional entropy cannot be calculated directly. The missing term can degrade the performance of the learned policy. In this paper, we propose twin auxiliary classifiers GAIL (TAC-GAIL), a method that addresses the missing conditional entropy in AC-GAIL. Specifically, we add a second auxiliary classifier to the AC-GAIL framework, which is used to classify the generated samples. We theoretically prove the effectiveness of this method, and experimental results on MuJoCo tasks show that TAC-GAIL effectively improves the performance of the learned multi-modal policy.
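
As an illustration of why a classifier trained on generated samples can stand in for a conditional entropy that cannot be computed directly, the toy sketch below checks a standard identity: the cross-entropy of any classifier q(c | x) on samples from a joint distribution p(x, c) is an upper bound on H(c | x), and the bound is tight when q equals the true posterior. This is not the paper's derivation or implementation. The discrete table `joint`, the array `imperfect_classifier`, and both helper functions are hypothetical, with x standing in for a binned state-action sample and c for a modality label.

```python
import numpy as np


def conditional_entropy(joint):
    """H(c | x) in nats for a joint table joint[x, c] with positive entries."""
    p_x = joint.sum(axis=1, keepdims=True)   # marginal p(x), shape (n_x, 1)
    p_c_given_x = joint / p_x                # true posterior p(c | x)
    return float(-(joint * np.log(p_c_given_x)).sum())


def cross_entropy(joint, q_c_given_x):
    """E_{(x, c) ~ joint}[-log q(c | x)] in nats for a classifier table q."""
    return float(-(joint * np.log(q_c_given_x)).sum())


# Hypothetical toy distribution: 3 bins of x, 2 modality labels c (sums to 1).
joint = np.array([[0.30, 0.10],
                  [0.05, 0.25],
                  [0.15, 0.15]])

# A classifier that matches the true posterior exactly.
true_posterior = joint / joint.sum(axis=1, keepdims=True)

# An assumed imperfect classifier; each row is a distribution over c.
imperfect_classifier = np.array([[0.6, 0.4],
                                 [0.3, 0.7],
                                 [0.5, 0.5]])

h = conditional_entropy(joint)
ce_perfect = cross_entropy(joint, true_posterior)
ce_imperfect = cross_entropy(joint, imperfect_classifier)

print(f"H(c|x) = {h:.4f}")
print(f"cross-entropy, perfect classifier   = {ce_perfect:.4f}")
print(f"cross-entropy, imperfect classifier = {ce_imperfect:.4f}")
```

The gap between the cross-entropy and H(c | x) is the expected KL divergence between the true posterior and the classifier, so a better-trained classifier yields a tighter estimate. How TAC-GAIL uses its second classifier inside the adversarial objective is specified in the paper itself and is not reproduced here.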