Conference: International Conference on Artificial Neural Networks

Neural Networks for Detecting Irrelevant Questions During Visual Question Answering

Abstract

Visual question answering (VQA) is the task of producing correct answers to questions about images. When given a question that is irrelevant to an image, existing VQA models still produce an answer rather than predicting that the question is irrelevant. This behavior shows that current VQA models do not truly understand images and questions. Moreover, producing answers to irrelevant questions can be misleading in real-world applications. To tackle this problem, we hypothesize that the abilities required for detecting irrelevant questions are similar to those required for answering questions. Based on this hypothesis, we study what performance a state-of-the-art VQA network can achieve when trained on irrelevant question detection. Then, we analyze the influence of reasoning and relational modeling on the task of irrelevant question detection. Our experimental results indicate that a VQA network trained on an irrelevant question detection dataset outperforms existing state-of-the-art methods by a large margin on this task. Ablation studies show that explicit reasoning and relational modeling benefit irrelevant question detection. Finally, we investigate the straightforward idea of integrating the ability to detect irrelevant questions into VQA models by joint training with extended VQA data containing irrelevant cases. The results suggest that joint training has a negative impact on the model's performance on the VQA task, while the accuracy on relevance detection is maintained. In this paper, we claim that an efficient neural network designed for VQA can achieve high accuracy in detecting relevance; however, integrating the ability to detect relevance into a VQA model by joint training leads to degraded performance on the VQA task.
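
The abstract does not say how the "extended VQA data containing irrelevant cases" is built. As a minimal sketch of one possible construction, and not the authors' procedure, the Python below pairs existing questions with a different image and labels the result with an extra answer class. The `Sample` type, the `IRRELEVANT` label, the `ratio` parameter, and the mismatching strategy itself are all assumptions made for illustration.

```python
import random
from typing import List, NamedTuple

# Hypothetical extra answer label; the abstract does not name one.
IRRELEVANT = "<irrelevant>"


class Sample(NamedTuple):
    image_id: str
    question: str
    answer: str


def extend_with_irrelevant(samples: List[Sample],
                           ratio: float = 0.5,
                           seed: int = 0) -> List[Sample]:
    """Return the original samples plus mismatched (image, question) pairs.

    Assumption: a question written for one image is treated as irrelevant
    to a different image. This is a noisy heuristic, since a generic
    question may happen to fit the other image as well.
    """
    rng = random.Random(seed)
    image_ids = sorted({s.image_id for s in samples})
    if len(image_ids) < 2:
        # No other image to pair with, so nothing can be mismatched.
        return list(samples)

    # Number of irrelevant cases to add, capped at the dataset size.
    n_extra = min(len(samples), int(len(samples) * ratio))
    extra = []
    for s in rng.sample(samples, n_extra):
        # Pick any image other than the one the question was written for.
        other = rng.choice([i for i in image_ids if i != s.image_id])
        extra.append(Sample(other, s.question, IRRELEVANT))
    return list(samples) + extra


data = [
    Sample("img1", "What color is the cat?", "black"),
    Sample("img2", "How many cars are parked?", "2"),
    Sample("img3", "Is the man wearing a hat?", "yes"),
    Sample("img4", "What sport is being played?", "tennis"),
]
extended = extend_with_irrelevant(data, ratio=0.5)
```

A model trained jointly on such data would treat `IRRELEVANT` as one more answer class. The abstract reports that joint training of this kind lowered accuracy on the VQA task, so the sketch illustrates the setup rather than a recommended recipe.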
