IEEE Conference on Computer Vision and Pattern Recognition
Answer-Type Prediction for Visual Question Answering

Abstract

Recently, algorithms for object recognition and related tasks have become sufficiently proficient that new vision tasks can now be pursued. In this paper, we build a system capable of answering open-ended text-based questions about images, which is known as Visual Question Answering (VQA). Our approach's key insight is that we can predict the form of the answer from the question. We formulate our solution in a Bayesian framework. When our approach is combined with a discriminative model, the combined model achieves state-of-the-art results on four benchmark datasets for open-ended VQA: DAQUAR, COCO-QA, The VQA Dataset, and Visual7W.
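
The abstract does not spell out the Bayesian formulation, so the following is only a minimal sketch of one plausible reading: estimate a distribution over answer types from the question, then weight per-type answer distributions by it and marginalise over the types. The function name `answer_posterior`, the answer types, and every probability below are hypothetical values chosen for illustration; they are not taken from the paper.

```python
# Illustrative sketch only: names and probabilities are made-up assumptions,
# not the model or the numbers from the paper.

def answer_posterior(p_type_given_q, p_answer_given_type):
    """Marginalise over answer types: P(a | x, q) = sum_t P(a | t, x, q) * P(t | q)."""
    scores = {}
    for answer_type, p_type in p_type_given_q.items():
        for answer, p_answer in p_answer_given_type.get(answer_type, {}).items():
            scores[answer] = scores.get(answer, 0.0) + p_type * p_answer
    total = sum(scores.values())
    # Renormalise so the result is a proper distribution over answers.
    return {answer: score / total for answer, score in scores.items()}


# Hypothetical P(type | question) for "How many dogs are in the picture?"
p_type_given_q = {"number": 0.9, "color": 0.05, "object": 0.05}

# Hypothetical P(answer | type, image, question) for each answer type.
p_answer_given_type = {
    "number": {"2": 0.7, "3": 0.3},
    "color": {"brown": 0.6, "white": 0.4},
    "object": {"dog": 1.0},
}

posterior = answer_posterior(p_type_given_q, p_answer_given_type)
best_answer = max(posterior, key=posterior.get)
print(best_answer, round(posterior[best_answer], 2))
```

In this toy setting the question strongly suggests a numeric answer, so answers of other types are pushed down even when they score well within their own type.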
