Conference: IEEE International Conference on Multimedia and Expo

Twice Opportunity Knocks Syntactic Ambiguity: A Visual Question Answering Model with Yes/No Feedback

Abstract

Visual Question Answering (VQA) is a joint task that aims to answer questions about given images. Syntactic ambiguity is common in dialogs between humans, and it also appears in the questions posed to VQA systems. Existing VQA methods generally use one-shot answering frameworks, which struggle when a question is syntactically ambiguous. In human dialogs, people often resolve the problem by asking a question back for confirmation. Inspired by this observation, we propose a novel method that eliminates syntactic ambiguity in VQA through the user's feedback. We compared our method with existing methods on two benchmark datasets, CLEVR and CLEVR-CoGenT. On the CLEVR dataset, the accuracy of our method is close to 100%. On the CLEVR-CoGenT dataset, our method is 21% higher than the state-of-the-art method.
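
The confirmation idea in the abstract can be illustrated with a minimal sketch. This is not the authors' model: the `Interpretation` structure, the `answer_with_feedback` function, and the confidence scores are all hypothetical, and the sketch assumes that some upstream component has already produced candidate readings of the question together with the answer each reading would yield.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Interpretation:
    """One candidate reading of an ambiguous question (hypothetical structure)."""
    paraphrase: str  # unambiguous restatement shown to the user
    answer: str      # answer the system would give under this reading
    score: float     # assumed confidence that this reading is the intended one


def answer_with_feedback(
    readings: List[Interpretation],
    ask_user: Callable[[str], bool],
) -> str:
    """Return an answer, asking one yes/no question when several readings exist."""
    if not readings:
        raise ValueError("at least one interpretation is required")

    # Rank readings from most to least likely.
    ranked = sorted(readings, key=lambda r: r.score, reverse=True)

    # An unambiguous question needs no feedback (the one-shot case).
    if len(ranked) == 1:
        return ranked[0].answer

    # Ask the user to confirm the most likely reading.
    top = ranked[0]
    if ask_user(f"Did you mean: {top.paraphrase}?"):
        return top.answer

    # On "no", fall back to the next most likely reading.
    return ranked[1].answer
```

For a question with two readings, a "yes" from the user returns the answer of the higher-scored reading and a "no" returns the answer of the other one, so the system gets a second chance instead of committing to a single guess.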