IEEE Winter Conference on Applications of Computer Vision

Semantically Guided Visual Question Answering


Abstract

We present a novel approach to enhance the challenging task of Visual Question Answering (VQA) by incorporating and enriching semantic knowledge in a VQA model. We first apply Multiple Instance Learning (MIL) to extract a richer visual representation addressing concepts beyond objects such as actions and colors. Motivated by the observation that semantically related answers often appear together in prediction, we further develop a new semantically-guided loss function for model learning which has the potential to drive weakly-scored but correct answers to the top while suppressing wrong answers. We show that these two ideas contribute to performance improvement in a complementary way. We demonstrate competitive results comparable to the state of the art on two VQA benchmark datasets.
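
The abstract does not say how the MIL step pools region-level evidence into an image-level concept score. As a minimal sketch only, the example below assumes noisy-OR pooling, a common MIL choice that may differ from what the authors used. The function names, the concept labels, and the per-region probabilities are all hypothetical and chosen for illustration.

```python
from math import prod


def noisy_or(region_probs):
    """Image-level probability that a concept is present.

    Assumption: noisy-OR pooling. The image (bag) is positive for a concept
    if at least one region (instance) is positive, with regions treated as
    independent.
    """
    return 1.0 - prod(1.0 - p for p in region_probs)


def image_concept_scores(region_scores):
    """Map each concept to its pooled image-level probability.

    region_scores: dict of concept name -> list of per-region
    probabilities in [0, 1].
    """
    return {concept: noisy_or(probs) for concept, probs in region_scores.items()}


# Hypothetical per-region outputs for one image (made-up numbers).
# Concepts go beyond objects, e.g. a color and an action.
regions = {
    "red": [0.1, 0.8, 0.2],
    "running": [0.05, 0.1, 0.05],
}

scores = image_concept_scores(regions)
for concept, score in scores.items():
    print(f"{concept}: {score:.3f}")
```

Under this assumption, a single confident region is enough to give the whole image a high score for a concept, which is why MIL suits weakly labeled concepts such as colors and actions that have no bounding-box annotation.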
