Venue: IEEE/CVF Conference on Computer Vision and Pattern Recognition

Are You Talking to Me? Reasoned Visual Dialog Generation Through Adversarial Learning




The visual dialog task requires an agent to engage in a conversation with a human about an image. It extends the visual question answering task: the agent must still answer a question about an image, but it must do so in light of the dialog that has already taken place. The key challenge in visual dialog is thus maintaining a consistent and natural dialog while continuing to answer questions correctly. We present a novel approach that combines Reinforcement Learning and Generative Adversarial Networks (GANs) to generate more human-like responses to questions. The GAN helps overcome the relative paucity of training data and the tendency of the typical MLE-based approach to generate overly terse answers. Critically, the GAN is tightly integrated into the attention mechanism that generates human-interpretable reasons for each answer. As a result, the discriminative model of the GAN must assess whether a candidate answer was generated by a human, given the provided reason. This is significant because it drives the generative model to produce high-quality answers that are well supported by the associated reasoning. The method also achieves state-of-the-art results on the primary benchmark.
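The training signal described in the abstract can be illustrated with a toy sketch: a generator samples an answer, a discriminator scores how human-like that answer is given a reason, and the score is used as a reinforcement learning reward. This is not the authors' model. The names `toy_discriminator`, `reinforce_step`, and `train`, the three-answer vocabulary, the fixed `reason` vector, and the use of a plain REINFORCE update with an expected-reward baseline are all assumptions made for illustration. In particular, the discriminator here is a fixed scoring function, whereas in a GAN it would be trained jointly with the generator.

```python
import math
import random


def softmax(logits):
    """Convert logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def toy_discriminator(answer_id, reason):
    """Hypothetical stand-in for the discriminator.

    Returns a score in (0, 1) read as "probability the answer is
    human-generated, given the reason". Here the reason is just a
    vector with one support score per candidate answer.
    """
    return 1.0 / (1.0 + math.exp(-reason[answer_id]))


def reinforce_step(logits, reason, rng, lr=0.5):
    """One REINFORCE update of the toy generator's logits."""
    probs = softmax(logits)
    # The generator samples a candidate answer from its current policy.
    answer = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    # The discriminator's score acts as the reward.
    reward = toy_discriminator(answer, reason)
    # Baseline: expected reward under the current policy (reduces variance).
    baseline = sum(p * toy_discriminator(a, reason) for a, p in enumerate(probs))
    advantage = reward - baseline
    # For a softmax policy, d log pi(answer) / d logit_k = 1[k == answer] - pi_k.
    return [
        theta + lr * advantage * ((1.0 if k == answer else 0.0) - probs[k])
        for k, theta in enumerate(logits)
    ]


def train(num_steps=2000, seed=0):
    """Train the toy generator and return its final answer distribution."""
    rng = random.Random(seed)
    reason = [-2.0, 0.0, 3.0]  # assumed: answer 2 is best supported by the reason
    logits = [0.0, 0.0, 0.0]   # the generator starts out uniform
    for _ in range(num_steps):
        logits = reinforce_step(logits, reason, rng)
    return softmax(logits)


if __name__ == "__main__":
    print(train())
```

In this sketch the generator shifts probability mass toward the answer the discriminator rates as most human-like for the given reason, which mirrors, in a very reduced form, how a discriminator reward can steer a generator toward well-supported answers.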


