AAAI Conference on Artificial Intelligence

Dialogue Generation: From Imitation Learning to Inverse Reinforcement Learning

Abstract

The performance of adversarial dialogue generation models relies on the quality of the reward signal produced by the discriminator. The reward signal from a poor discriminator can be very sparse and unstable, which may lead the generator to fall into a local optimum or to produce nonsensical replies. To alleviate the first problem, we first extend a recently proposed adversarial dialogue generation method to an adversarial imitation learning solution. Then, in the framework of adversarial inverse reinforcement learning, we propose a new reward model for dialogue generation that can provide a more accurate and precise reward signal for generator training. We evaluate the performance of the resulting model with automatic metrics and human evaluations in two annotation settings. Our experimental results demonstrate that our model generates higher-quality responses and achieves better overall performance than the state of the art.
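The abstract does not give the reward model's exact form, but a standard way to realize an adversarial-inverse-reinforcement-learning reward for a generator is the AIRL parameterization (Fu et al., 2018): the discriminator is written as D = exp(f(s,a)) / (exp(f(s,a)) + pi(a|s)), so the generator reward log D - log(1 - D) collapses to f(s,a) - log pi(a|s), a dense signal that avoids the sparse, saturating output of a raw binary discriminator. The PyTorch sketch below illustrates that construction only; the class name AIRLRewardModel, the two-layer scorer, and the use of pooled context/response embeddings are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn as nn

class AIRLRewardModel(nn.Module):
    """AIRL-style reward model over (context, response) pairs (a sketch)."""

    def __init__(self, hidden_dim: int = 256):
        super().__init__()
        # f(s, a): learned score of a dialogue context s and response a.
        # A simple MLP over concatenated embeddings; assumed, not from the paper.
        self.f = nn.Sequential(
            nn.Linear(2 * hidden_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, 1),
        )

    def forward(self, context_emb, response_emb, log_pi):
        f = self.f(torch.cat([context_emb, response_emb], dim=-1)).squeeze(-1)
        # AIRL discriminator: D = exp(f) / (exp(f) + pi(a|s)), hence
        # log D = f - logaddexp(f, log_pi) in numerically stable form.
        log_d = f - torch.logaddexp(f, log_pi)
        # Generator reward log D - log(1 - D) simplifies to f - log pi.
        reward = f - log_pi
        return log_d, reward

# Toy usage with random embeddings (batch of 8, 256-dim pooled vectors).
model = AIRLRewardModel()
ctx = torch.randn(8, 256)        # pooled encodings of dialogue contexts
rsp = torch.randn(8, 256)        # pooled encodings of candidate responses
log_pi = -5.0 * torch.rand(8)    # generator log-probabilities of the responses
log_d, reward = model(ctx, rsp, log_pi)

In such a setup, f would be fit with binary cross-entropy to separate human responses (maximizing log D) from generated ones (maximizing log(1 - D) = log_pi - logaddexp(f, log_pi)), while the generator would be updated by policy gradient with reward as the return; this is one plausible reading of how a denser, more stable reward signal reaches generator training than a raw discriminator probability would provide.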
