Twice Opportunity Knocks Syntactic Ambiguity: A Visual Question Answering Model with yes/no Feedback



Abstract

Visual Question Answering (VQA) is a joint task that aims to answer questions based on given images. Syntactic ambiguity is common in dialogs between humans, and it can also occur in the questions posed to VQA systems. Existing VQA methods generally use one-shot answering frameworks, which struggle when a question is syntactically ambiguous. In human dialogs, people often resolve the problem by asking a question back for confirmation. Inspired by this observation, we propose a novel method that eliminates syntactic ambiguity in VQA via the user's feedback. We compared our method with existing methods on two benchmark datasets, CLEVR and CLEVR-CoGenT. The accuracy of our method is close to 100% on the CLEVR dataset, and on the CLEVR-CoGenT dataset it is 21% higher than that of the state-of-the-art method.


Bibliographic details

Source
IEEE International Conference on Multimedia and Expo | 2019 | pp. 736-741 | 6 pages
Authors
Jianming Wang; Wei Deng; Yukuan Sun; Yuanyuan Li; Kai Wang; Guanghao Jin;
Original format: PDF
Keywords
Syntactics; Visualization; Feature extraction; Task analysis; Linguistics; Knowledge discovery; Layout;


Similar literature


1. Object sequences: encoding categorical and spatial information for a yes/no visual question answering task [J]. Shivam Garg, Rajeev Srivastava. Computer Vision, IET. 2018, No. 8
2. Visual question answering via Attention-based syntactic structure tree-LSTM [J]. Liu Yun, Zhang Xiaoming, Huang Feiran. Applied Soft Computing. 2019
3. Differences in Reaction to Immediate Feedback and Opportunity to Revise Answers for Multiple-Choice and Open-Ended Questions [J]. Attali Yigal, Laitusis Cara, Stone Elizabeth. Educational and Psychological Measurement. 2016, No. 5
4. Twice Opportunity Knocks Syntactic Ambiguity: A Visual Question Answering Model with yes/no Feedback [C]. Jianming Wang, Wei Deng, Yukuan Sun. IEEE International Conference on Multimedia and Expo. 2019
5. An Analysis of Bottom-Up Attention Models and Multimodal Representation Learning for Visual Question Answering [D]. Narayanan, Venkatraman. 2019
6. Differences in Reaction to Immediate Feedback and Opportunity to Revise Answers for Multiple-Choice and Open-Ended Questions [O]. Yigal Attali, Cara Laitusis, Elizabeth Stone. 2016
