Semantic Reanalysis of Scene Words in Visual Question Answering


Abstract

Visual Question Answering (VQA) is a joint task that aims to answer questions based on given images. Correctly analysing questions that span multiple photo albums remains a key issue in VQA: when a question draws on several albums, the model must correctly understand both the album images and the corresponding question. When multiple albums are involved and the question contains scene words, the model may identify the wrong scene and output the wrong answer, which degrades VQA performance. To address this problem, this paper proposes a new image and sentence similarity matching model that outputs the correct image representation by learning semantic concepts. Because a scene word is not an entity, the information the model extracts for it may sometimes be incorrect. The model therefore reanalyses the question in a different way and gives the answer according to the similarity between the question and the visual-text. The model was tested on the MemexQA dataset. The experimental results show that it not only produces meaningful text sentences that justify the answer, but also improves accuracy by nearly 10%.
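The idea of answering by similarity between the question and the visual-text can be illustrated with a minimal sketch. This is not the paper's model: the abstract does not describe the architecture, so the bag-of-words cosine similarity below is a simple stand-in for a learned matching model, and the function names (`tokenize`, `cosine_similarity`, `best_match`) and the sample descriptions are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between the bag-of-words vectors of two strings.

    Assumption: a plain word-count vector stands in for the learned
    semantic representation that the paper's model would produce.
    """
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(question, descriptions):
    """Return (index, score) of the description most similar to the question."""
    scores = [cosine_similarity(question, d) for d in descriptions]
    best = max(range(len(descriptions)), key=scores.__getitem__)
    return best, scores[best]


# Hypothetical text descriptions, one per photo album.
descriptions = [
    "A dog running on the beach",
    "Friends at a birthday party in a restaurant",
    "Snowy mountain hiking trip",
]
question = "Where did we go for the birthday party?"
index, score = best_match(question, descriptions)
print(index, round(score, 3))
```

In this toy setting the second description is selected because it shares the words "birthday" and "party" with the question. A learned model would replace the word counts with embeddings so that scene words without a literal match can still be aligned with the right album.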


Bibliographic details

Source
Chinese conference on pattern recognition and computer vision | 2019 | xxx 629 p. | 12 pages
Authors
Shiling Jiang; Ming Ma; Jianming Wang; Jiayu Liang; Kunliang Liu; Yukuan Sun; Wei Deng; Siyu Ren; Guanghao Jin
Original format: PDF
Chinese Library Classification: Computing technology, computer technology
Keywords
Visual question answering; Similarity match; Scene words; Computer vision; Natural language processing

Indexed on 2022-08-20 23:53:16

Similar literature


1. Computing Word Semantic Relatedness for Question Retrieval in Community Question Answering [J]. Jung-Tae LEE, Young-In SONG, Hae-Chang RIM. IEICE Transactions on Information and Systems. 2009, No. 4.
2. Meanings are more than just words: A Cross-Domain Question Answering Tool based on Unsupervised Semantic Feature Learning [J]. Nripa Chetry, Debanjan Choudhary, Arindam Chatterjee. International Journal of Computer Trends and Technology. 2017, No. 2.
3. R-VQA: Learning Visual Relation Facts with Semantic Attention for Visual Question Answering [J]. Pan Lu, Lei Ji, Wei Zhang. SIGKDD explorations. 2018.
4. Semantic Reanalysis of Scene Words in Visual Question Answering [C]. Shiling Jiang, Ming Ma, Jianming Wang. Chinese conference on pattern recognition and computer vision. 2019.
5. Long-answer question answering and rhetorical-semantic relations [D]. Blair-Goldensohn, Sasha J. 2007.
6. More Questions than Answers: Continued Critical Reanalysis of Fredrickson et al.'s Studies of Genomics and Well-Being [O]. Nicholas J. L. Brown, Douglas A. MacDonald, Manoj P. Samanta.
7. Natural scene classification, annotation and retrieval. Developing different approaches for semantic scene modelling based on Bag of Visual Words [O]. Alqasrawi Yousef T. N. 2012.
